Abstract
Whole-genome duplication (WGD), or polyploidy, events are widespread and significant in the evolutionary history of angiosperms. However, empirical evidence for rediploidization, the major process where polyploids give rise to diploid descendants, is still lacking at the genomic level. Here we present chromosome-scale genomes of the mangrove tree Sonneratia alba and the related inland plant Lagerstroemia speciosa. Their common ancestor experienced a whole-genome triplication (WGT) approximately 64 million years ago, coinciding with a period of dramatic global climate change. Sonneratia, adapting to mangrove habitats, experienced extensive chromosome rearrangements post-WGT. We observe that the WGT retentions display sequence and expression divergence, suggesting potential neo- and sub-functionalization. Strong selection acting on three-copy retentions indicates adaptive value in response to new environments. To elucidate the role of ploidy changes in genome evolution, we improve a model of the polyploidization–rediploidization process based on genomic evidence, contributing to the understanding of adaptive evolution during climate change.
Subject terms: Plant evolution, Evolutionary biology, Genome evolution, Evolutionary genetics
The polyploidization–rediploidization process plays an important role in plant adaptive evolution. Here, the authors assemble the genomes of the mangrove species Sonneratia alba and its inland relative Lagerstroemia speciosa, and reveal genomic evidence for rediploidization and adaptive evolution after a whole-genome triplication.
Introduction
The origin and radiation of flowering plants (angiosperms) in the mid-Cretaceous was famously referred to by Charles Darwin as “an abominable mystery”1,2. Presently, angiosperms encompass over 90% of all living plant species, with approximately 350,000 known species, making them the most successful land plants on Earth (www.theplantlist.org). There is growing consensus that whole-genome duplication (WGD) events, also known as polyploidy, have played a widespread and significant role in the evolutionary history of angiosperms3–9. Early WGD events in plants can be traced back to the common ancestors of extant seed plants and angiosperms, respectively10. Furthermore, core eudicots, a major clade within angiosperms, experienced a well-known paleo-hexaploidization event11. WGDs have also occurred in various lineages, even recurrently, like Arabidopsis12,13, soybean14, carrot15, and Utricularia16. One specific type of WGD, known as whole-genome triplication (WGT) or hexaploidy, originated through hybridization between tetraploid and diploid species17–19. Genomic data have revealed at least 18 independent WGT events in eudicots11,17,20–34, indicating a prevalence higher than previously assumed (Supplementary Fig. 1). Despite the challenges that have emerged since the WGD, such as stable chromosome segregation, detrimental ecological interactions with diploid progenitors, and minority cytotype exclusion35,36, the polyploidy events observed in plants highlight their evolutionary potential. Experimental and simulation studies have supported the adaptive potential of polyploidy, especially in the face of dramatic and fluctuating environmental conditions37–39. Overall, polyploidy has been recognized as a major driving force behind evolutionary adaptation and diversification4,5.
Plants have experienced periods of global climate change, and genomic resources offer an opportunity to better understand the dynamics of plant evolution during such global climate changes40–42. The relationship between WGD and the success of plant lineages is an intriguing topic43–46. Previous studies have revealed several instances of WGD occurring independently during three periods of climatic instability and environmental perturbations: the Early Cretaceous around 120 million years ago (Mya)47, the K-Pg boundary around 65 Mya48, and the Miocene–Pliocene (<20 Mya)9,49. These WGD events may have provided a buffer for plants and facilitated their survival and adaptation to rapidly changing environments by increasing genomic plasticity and genotypic combinations.
Mangroves have successfully adapted to extreme intertidal zones, bridging terrestrial and marine ecosystems, and have evolved a series of adaptive traits, such as salt tolerance, aerial roots, and viviparous embryos50–52. They are attractive ecological model systems for investigating adaptive evolution. Prior to colonizing their new habitat, several mangrove species independently experienced WGD events53–57. Nevertheless, almost all mangrove species are currently considered diploids (Supplementary Data 1), indicating the importance of the rediploidization process in ancient polyploids. Rediploidization involves redundancy reduction, coordination of subgenomic function, and chromosome fractionations, ultimately leading to the establishment of modern diploid descendants cytogenetically and potentially contributing to plant adaptation58,59. Despite its significance, the rediploidization process in ancient polyploid plants remains poorly understood. With advancements in genome sequencing and assembly technologies, high-quality chromosome-scale genomes provide an opportunity to reconstruct ancestral genomes and infer the trajectory of plant genome evolution60,61. We can now explore the process of rediploidization following polyploidization on a genomic scale.
In this work, we present two chromosome-scale genomes of Lythraceae plants: the mangrove tree Sonneratia alba (Supplementary Fig. 2) and the related inland plant Lagerstroemia speciosa (Supplementary Fig. 3), as part of the worldwide mangrove genomes project62. Through comprehensive analyses, we trace the evolutionary history of genomes and investigate the polyploidization–rediploidization process and its implications for adaptive evolution in the face of global climate change.
Results and discussion
Genome sequencing, assembly, and annotation
We first utilized high-throughput chromosome conformation capture (Hi-C) technology to improve the genome of S. alba. This improvement builds upon our prior study utilizing PacBio Single-Molecule Real-Time (SMRT) sequencing and Illumina short-read sequencing54, resulting in a chromosome-scale assembly (Supplementary Table 1). The assembled genome derived from anchored contigs was 204.46 Mb, aligning closely with the genome size estimated through k-mer-based analysis (211.67 Mb). It comprised 12 chromosomes (97.60% of all sequences) and 40 unanchored scaffolds. The N50 value notably increased from 5.52 Mb to 15.69 Mb (Table 1). Additionally, we de novo assembled the genome of the closely related inland woody plant L. speciosa by incorporating high-depth PacBio SMRT sequencing, Illumina short-read sequencing, and Hi-C technologies (Supplementary Table 1). The assembled genome of L. speciosa was 319.66 Mb, broadly consistent with the estimated genome size (361 Mb by flow cytometry and 340.46 Mb by k-mer-based analysis), with an N50 value of 12.74 Mb. It comprised 24 chromosomes, encompassing 98.08% of all sequences (Table 1). Both assemblies showed high congruence with the Hi-C data, with the strongest interaction signals clustered along the expected diagonal (Fig. 1a and Supplementary Fig. 4). Gene prediction combined ab initio, homology-based, and RNA-seq-assisted strategies. Integrating these predictions with EvidenceModeler produced non-redundant consensus gene models for the S. alba and L. speciosa genomes (see Methods for details). This yielded a total of 25,284 (Supplementary Fig. 5) and 30,497 (Supplementary Fig. 6) protein-coding genes, respectively, characterized by high completeness (Supplementary Table 2). Moreover, 99.38% and 99.43% of them were anchored to chromosomes, respectively. 
The presence of syntenic blocks between the two genomes further supported their quality as chromosome-scale assemblies (Fig. 1b). These high-quality genomes can supplant earlier assemblies, serving as valuable references for genomic and evolutionary studies in plants (Supplementary Note 1).
Table 1.
Genome features | Sonneratia alba | Lagerstroemia speciosa |
---|---|---|
Sequencing methods | Illumina + PacBio + Hi-C | Illumina + PacBio + Hi-C |
Sequencing reads | 32.99 Gb + 28.36 Gb + 103.88 Gb | 41.16 Gb + 95.60 Gb + 54.19 Gb |
Assembled genome size | 204.46 Mb | 319.66 Mb |
Anchored size | 199.55 Mb (97.60%) | 313.51 Mb (98.08%) |
Anchored gene number | 25,126 | 30,323 |
GC content | 41.77% | 40.40% |
Number of chromosomes | 2n = 24 | 2n = 48 |
Number of scaffolds | 52 | 629 |
N50 length | 15.69 Mb | 12.74 Mb |
N90 length | 12.96 Mb | 10.54 Mb |
Longest sequence length | 22.93 Mb | 17.34 Mb |
Gap content | 0.05% | 0.02% |
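The N50 and N90 values in Table 1 summarize assembly contiguity. As a minimal sketch of how such statistics are conventionally computed, the function below returns the sequence length at which the cumulative sorted lengths first reach a given fraction of the total assembly. The function name `nx` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a list of sequence lengths.

    Sequences are sorted from longest to shortest; the result is the length
    of the sequence at which the running total first reaches `fraction`
    of the assembly size (fraction=0.5 gives N50, 0.9 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (arbitrary units), not taken from either assembly.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)
```

Because the chromosomes dominate both assemblies, N50 and N90 are expected to fall within the range of chromosome lengths, which is what Table 1 reports.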
Less TE accumulation in the mangrove
Mangrove species have small genomes compared with their inland relatives63. Repetitive sequences are the primary determinant of plant genome size, and transposable elements (TEs) are the predominant components of repetitive elements64,65. First, we observed that S. alba has fewer chromosomes than L. speciosa (Fig. 1b). We estimated that TE sequences make up 20.95% (43 Mb) of the S. alba genome, compared with 36.50% (117 Mb) of the L. speciosa genome; other relatives also have higher TE contents (Supplementary Fig. 7 and Supplementary Table 3). The long terminal repeat retrotransposons (LTR-RTs), typical class I TEs, usually have high copy numbers and large sizes in plant genomes, contributing significantly to genome size growth66. The intact LTR-RTs were further classified into Copia and Gypsy element families, and their insertion time distributions were examined. We found that S. alba has much lower recent LTR-RT insertion rates than its relatives in Myrtales, especially in the Copia element family (Supplementary Fig. 8). Overall, the mangrove species S. alba maintains a smaller genome size, fewer chromosomes, lower accumulation of TEs, and a reduced rate of LTR-RT insertion, resulting in a more simplified genome.
WGT coinciding with dramatic global climate change
With the availability of chromosome-scale reference genomes in Lythraceae, we revisited the origin of Sonneratia, a significant taxon within the mangrove ecosystem. We reconstructed the phylogeny among three Lythraceae species (S. alba, L. speciosa, and Punica granatum) and four other species (Eucalyptus grandis, Arabidopsis thaliana, Vitis vinifera, and Nelumbo nucifera) with available pseudo-chromosome scale genome data. The tree topology was inferred using RAxML-NG with the GTR + GAMMA + I model based on 1,963 orthologous single-copy gene groups (Supplementary Fig. 9), and the divergence time was estimated using MCMCTREE from the PAML package with two reliable calibrations (see Methods for details). The divergence times were consistent with a previous study (Supplementary Table 4)67. Additionally, our estimation suggests that the mangrove S. alba diverged from the closely related inland woody plant L. speciosa around 57.79 Mya, while their common ancestor diverged from P. granatum, a member of the same family, around 67.82 Mya (Fig. 2a). We further constructed a larger-scale phylogenetic tree, incorporating 42 sequenced angiosperms along with the gymnosperm Gnetum montanum (as an outgroup), to reflect the positions of these plants within Lythraceae (Supplementary Fig. 10).
Whole-genome duplication (WGD), or polyploidy, events have played a significant role in the evolutionary history of angiosperms, aiding in their survival during periods of dramatic environmental changes4,9,43. WGD events can provide a substantial amount of genetic material for adaptation. In this study, we utilized a combination of synteny, Ks-based, and phylogenetic approaches (Supplementary Fig. 11) to confirm that S. alba and L. speciosa underwent a whole-genome triplication (WGT) event in their common ancestor, prior to their divergence (Fig. 2a). Initially, we scanned the genomes of three Lythraceae plants, namely S. alba, L. speciosa, and P. granatum, using BLASTP and MCScanX. We identified 164 syntenic block pairs comprising 5,999 gene pairs in S. alba; 486 syntenic block pairs comprising 12,180 gene pairs in L. speciosa; and 219 syntenic block pairs comprising 3,333 gene pairs in P. granatum. The presence of extensive syntenic block pairs indicated past polyploidy events. Subsequently, we calculated the number of synonymous substitutions per synonymous site (Ks) between paralogous genes in each genome. The Ks distribution revealed recent peaks in S. alba and L. speciosa, but not in P. granatum (Fig. 2b), suggesting that P. granatum did not experience this polyploidy event. Within the Ks peak range, we identified 584 three-copy retention groups in S. alba and 943 in L. speciosa (Supplementary Figs. 12–14), indicating that the polyploidy event in these species was a hexaploidy (whole-genome triplication, WGT) event. This finding was further supported by genome-wide syntenic regions between S. alba and P. granatum, as well as between L. speciosa and P. granatum (Fig. 2c and Supplementary Fig. 15). Furthermore, we presented an expected signature of the whole-genome triplication event through collinear genes in the modern genome (Supplementary Fig. 16). Because the Ks peaks appear slightly different between S. alba and L. speciosa, we performed gene tree reconstructions of the syntenic gene groups and confirmed that the WGT event occurred prior to the divergence between Sonneratia and Lagerstroemia (Fig. 2d). The placement of the WGT event was also validated using the multi-taxon paleopolyploidy search (MAPS) analysis and corresponding simulations (Supplementary Fig. 17). This multipronged approach allows us to overcome the challenges posed by divergent evolutionary rates in different plants, enabling more accurate identification of the features and positions of polyploidy events23,68–70.
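To illustrate how three-copy retention groups might be tallied from paralogous pairs falling within a Ks peak, the following sketch filters pairs by a Ks window and groups connected genes with a union-find structure. This is only one plausible approach under stated assumptions: the gene identifiers, the Ks window, and the function name `retention_groups` are hypothetical, and the authors' actual procedure (based on synteny from MCScanX and WGDI) may differ.

```python
from collections import defaultdict


def retention_groups(pairs, ks_min, ks_max):
    """Group genes linked by paralogous pairs whose Ks lies in [ks_min, ks_max].

    `pairs` is an iterable of (gene_a, gene_b, ks) tuples. Genes connected
    directly or transitively by in-window pairs end up in the same group.
    """
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for gene_a, gene_b, ks in pairs:
        if ks_min <= ks <= ks_max:
            root_a, root_b = find(gene_a), find(gene_b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


# Hypothetical pairs and Ks window, for illustration only.
toy_pairs = [
    ("g1", "g2", 0.9), ("g2", "g3", 1.0),  # a three-copy group
    ("g4", "g5", 0.95),                    # a two-copy group
    ("g6", "g7", 2.5),                     # outside the window (older event)
]
toy_groups = retention_groups(toy_pairs, ks_min=0.6, ks_max=1.4)
three_copy = [group for group in toy_groups if len(group) == 3]
```

A predominance of groups with up to three members, rather than two, is the kind of signal that distinguishes a triplication from a duplication.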
Extrapolating from the divergence time in Lythraceae, we further estimated that the shared WGT event of Sonneratia and Lagerstroemia occurred around 64 Mya (Fig. 2a, see Methods for details), slightly after the divergence from P. granatum. This WGT event coincided with a brief period of dramatic global climate change resulting from a large asteroid collision with the Earth, known as the Cretaceous-Paleogene (K-Pg) boundary, which took place around 66 Mya48. Polyploidy events play a significant role in reshaping gene regulatory networks in response to environmental stresses9,71. A series of ancient WGD events occurred independently in numerous plant lineages around the K-Pg boundary43,45,49. These events served as a buffer for plants, enhancing their ability to survive and adapt to rapidly changing environments by increasing genomic plasticity and generating diverse genotypic combinations. We suggest that the WGT event may have contributed to the survival of these plants during the extinction event. In addition, at approximately 55 Mya, there was a significant global temperature increase (warming by ∼6 °C within ∼20,000 years) and a rise in eustatic sea levels, known as the Paleocene-Eocene Thermal Maximum (PETM)72. The combination of sea level rise, mass extinction, and the WGT event potentially provided both environmental and genetic opportunities for offshore woody plants to develop a series of highly specialized traits (such as salt tolerance and aerial roots) to survive, leading to the emergence of the mangrove Sonneratia.
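One common way to extrapolate an event age from a dated divergence is to scale by Ks under a constant-rate assumption. The sketch below shows that proportional scaling only; it is not confirmed to be the procedure described in the Methods, and the function name and input values are placeholders rather than values from this study. Because substitution rates differ between lineages, as the slightly different Ks peaks in S. alba and L. speciosa suggest, such an estimate should be treated as approximate.

```python
def scale_age_by_ks(ks_event, ks_calibration, calibration_age_mya):
    """Estimate an event age by proportional Ks scaling.

    Assumes synonymous substitutions accumulate at a constant rate, so that
    age_event / age_calibration == ks_event / ks_calibration.
    """
    if ks_calibration <= 0:
        raise ValueError("ks_calibration must be positive")
    return calibration_age_mya * ks_event / ks_calibration


# Placeholder inputs: a calibration split at 100 Mya with Ks = 1.0,
# and an event whose paralog Ks peak sits at 0.5.
placeholder_age = scale_age_by_ks(ks_event=0.5, ks_calibration=1.0,
                                  calibration_age_mya=100.0)
```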
Genome evolution is a long-term and dynamic process. Early WGD events (ζ, ε, γ) occurred hundreds of millions of years ago10,11 and their corresponding collinearity has faded with time or been influenced by subsequent WGD events. Plants that have undergone recent WGD events within the past 20 million years still possess numerous redundant regions in the genomes. Therefore, the WGT event in Sonneratia (~64 Mya) provides a valuable opportunity to study the polyploidization–rediploidization process in angiosperms (Fig. 2a). Furthermore, by integrating appropriate genomic data, we positioned the WGT event within a narrow time window between two close speciation events, whose pattern is similar to the γ-WGT event associated with the early diversification of core eudicots.
Chromosome evolution following the WGT event
Many ancient polyploidy events have been followed by striking reductions in duplicated redundancy and chromosome number59. For example, Utricularia gibba, despite having a small plant genome, has a haploid chromosome number (n) of 14, yet it has undergone three WGD events since the well-known γ event shared by core eudicots16. If we exclusively consider polyploidy, the haploid chromosome number of Utricularia gibba would be 7 × 3 × 2 × 2 × 2 or n = 168, based on the ancestral chromosome number (n = 7) before experiencing γ-WGT event73. Conducting a chromosome-scale comparative investigation among the Lythraceae plants allowed us to explore the paleo-history following the WGT event. Our analysis inferred that the chromosome number of the common ancestor of Sonneratia and Lagerstroemia is n = 24 (post-WGT) and n = 8 (pre-WGT) (Supplementary Fig. 18). Additionally, the chromosome number of the common ancestor of the three Lythraceae plants is n = 8, which is the same as the chromosome number in P. granatum.
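The arithmetic in this paragraph can be written out explicitly. The short sketch below multiplies an ancestral haploid number by each ploidy multiplier, reproducing the n = 168 expectation for Utricularia gibba and the n = 24 post-WGT number inferred for the ancestor of Sonneratia and Lagerstroemia. The function name is an illustrative assumption.

```python
from math import prod


def expected_haploid_number(ancestral_n, ploidy_multipliers):
    """Haploid chromosome number expected if polyploidy were the only
    process changing chromosome number (no fusions, fissions, or losses)."""
    return ancestral_n * prod(ploidy_multipliers)


# Utricularia gibba: n = 7 ancestor, the gamma triplication, then three WGDs.
utricularia_expected = expected_haploid_number(7, [3, 2, 2, 2])

# Lythraceae: n = 8 pre-WGT ancestor followed by one triplication.
lythraceae_post_wgt = expected_haploid_number(8, [3])

# Fold reduction implied by the observed n = 14 in Utricularia gibba.
utricularia_fold_reduction = utricularia_expected / 14
```

The gap between the expected and observed numbers in Utricularia gibba (a 12-fold reduction) illustrates how strongly post-polyploid rearrangement can reshape a karyotype.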
To gain further insights into the evolutionary history of chromosomes, we reconstructed the ancestral Lythraceae karyotype (ALK) using WGDI based on adjacent conserved collinear blocks. Our evolutionary scenario suggests that the ALK of S. alba, L. speciosa, and P. granatum genomes consisted of eight proto-chromosomes with 18,885 proto-genes. As shown in Fig. 2e, the ancestor underwent a WGT event and subsequently experienced chromosomal rearrangements to attain their modern genome structure. The chromosome origin of S. alba appears more intricate than that of L. speciosa. S. alba’s chromosomes underwent a greater number of fission and fusion events compared to L. speciosa, although intra-chromosomal inversions were common in the chromosome histories of both species (Fig. 2e and Supplementary Fig. 19). Due to the complexity of chromosome evolutionary history in S. alba, we illustrated it using reciprocally translocated chromosome arms (RTA), end-to-end joining (EEJ), nested chromosome fusion (NCF) events, fission events, and chromosome inversions to depict a probable karyotype evolution (Supplementary Fig. 20).
Although the reconstructed ancestral karyotype is highly likely to possess a structure very similar to the true ancestral genome, it may not be entirely identical60. We therefore also performed synteny analysis among the modern genomes of the three Lythraceae species and confirmed numerous chromosome rearrangements (Fig. 2f). In contrast to the intra-chromosomal inversions observed in the related inland species, S. alba exhibited significant fission and fusion events (Supplementary Fig. 21). These findings indicate that the mangrove species has a reduced number of chromosomes and underwent more chromosome rearrangements than its closely related inland species L. speciosa.
Adaptation through polyploidization–rediploidization cycles
During periods of dramatic environmental and climatic change, newly formed polyploids can possess fitness advantages over diploids. This is supported by evidence that the persistence of WGD correlates with times of environmental and climatic change, suggesting a potential benefit of WGD in the face of challenges4,35,45,74–77. Nevertheless, polyploids may also face substantial disadvantages, including redundant components, gene dosage imbalance, increased replication and metabolic costs, cellular mismanagement, and a higher propensity for polyploid mitosis and meiosis to produce aneuploid cells35,58,77,78. Despite these immediate challenges, some polyploid lineages have persisted and even thrived79,80. As climatic conditions stabilize and environmental conditions improve, polyploids may experience reduced fitness compared to diploids due to the accumulation of genetic load, increased mutational load, slower positive selection, and reduced growth rates35,37,81,82. Therefore, the process of rediploidization following polyploidization may be inevitable for polyploids, ultimately leading to modern descendants that are cytogenetically normal diploids and generating important genetic and taxonomic diversity. In fact, nearly all angiosperms have undergone successive rounds of polyploidization and rediploidization (Supplementary Fig. 22)4,10,11,83. Considering the potential role of ploidy changes in genome evolution, we improve a model based on genomic evidence and previous studies35,58,59,77,84,85. This model explains the polyploidization–rediploidization process, elucidating adaptive evolution during global upheavals and restoration (Fig. 3 and Supplementary Data 2). 
Specifically, rediploidization through redundancy reduction, gene divergence and chromosome rearrangement confers advantages, such as shortening DNA replication and the cell cycle, and reducing recombination of locally adapted alleles, thereby facilitating the survival of the mangrove in barren intertidal zones.
Divergence of WGT retained genes in the mangrove genome
The differentiation of retained genes plays a crucial role in reducing gene redundancy and serves as a primary genetic basis for genome evolution. We observed that paralogous gene pairs generated by the WGT event in S. alba exhibited higher genetic divergences (Supplementary Fig. 23) and Ks values (Fig. 2b), indicating sequence differentiation. Besides the sequence divergence, expression divergence is also important. Therefore, we conducted transcriptome sequencing of leaf, root, flower, and fruit tissues of S. alba (Supplementary Fig. 24 and Supplementary Table 5) and employed the exact conditional test to investigate the expression divergence of WGT retained genes. We identified that approximately 58.04% to 64.57% of the paralogous gene pairs generated by the WGT were differentially expressed across these four tissues in the mangrove species (Supplementary Table 6). The different tissues harbored a similar number of differentially expressed gene pairs (DEGPs), with slightly higher numbers in leaf tissue and lower numbers in fruit tissue. These differentially expressed pairs belonged to 481–516 three-copy retention groups and 1937–2136 two-copy retention groups in different tissues (Supplementary Table 6). Moreover, we identified 1,789 gene pairs that showed differential expression across all four tissues (Supplementary Fig. 25). To investigate the functional roles of these DEGPs, we performed gene ontology (GO) enrichment analysis. The DEGPs in different tissues were predominantly enriched in the metabolic process and catalytic activity GO categories, while the DEGPs shared across all four tissues were enriched in more specific GO categories related to metabolic process, gene expression, biosynthetic process, mitochondrial envelope, and catalytic activity (Supplementary Data 3), which are critical for plant growth and adaptation. Similarly, we explored the expression divergence of WGT retained genes in the closely related inland plant L. speciosa. 
We also identified that around 60% of the paralogous gene pairs resulting from the WGT exhibited differential expression across four tissues in the related species (Supplementary Tables 7, 8), mirroring findings in the mangrove species. These results suggest the potential neo- and sub-functionalization of the retained genes following the polyploidization–rediploidization process.
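For readers unfamiliar with the exact conditional test, a minimal sketch follows. Conditioning on the total read count of a paralogous pair, the count of one copy follows a binomial distribution under the null hypothesis of equal expression, so a two-sided exact p-value can be obtained by summing the probabilities of all outcomes no more likely than the observed one. The function name, the default null proportion of 0.5 (which assumes comparable effective gene lengths), and the toy counts are assumptions for illustration; the normalization used by the authors is not described in this section.

```python
from math import exp, lgamma, log


def exact_conditional_pvalue(count_a, count_b, p_null=0.5):
    """Two-sided exact conditional test for a pair of read counts.

    Given n = count_a + count_b, count_a ~ Binomial(n, p_null) under the
    null. The p-value sums the probabilities of every outcome that is no
    more likely than the observed count_a.
    """
    if not 0.0 < p_null < 1.0:
        raise ValueError("p_null must lie strictly between 0 and 1")
    n = count_a + count_b
    if n == 0:
        return 1.0  # no reads, no evidence either way

    def pmf(k):
        # Computed in log space so large read counts do not overflow.
        log_p = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                 + k * log(p_null) + (n - k) * log(1.0 - p_null))
        return exp(log_p)

    observed = pmf(count_a)
    # Small relative tolerance so ties with the observed outcome are included.
    total = sum(p for p in map(pmf, range(n + 1))
                if p <= observed * (1.0 + 1e-9))
    return min(1.0, total)


# Toy counts: a balanced pair and a strongly biased pair.
p_balanced = exact_conditional_pvalue(5, 5)
p_biased = exact_conditional_pvalue(10, 0)
```

In a genome-wide application, the resulting p-values would normally be corrected for multiple testing before pairs are called differentially expressed.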
Strong selection in WGT retained gene groups
Polyploidy is widely recognized as a major source of novel genetic material, which can undergo mutation and selection to give rise to new or specialized functions to aid adaptation86,87. To assess the impact of both negative and positive selection on sites located in different-copy (one-, two-, three-copy) retention groups in the mangrove species, we used the DFE-alpha approach to estimate the distribution of fitness effects (DFE) of new mutations and the proportion of adaptive divergence (α)88–91, based on the folded site frequency spectrum (SFS) and divergence between S. alba and S. apetala. We also estimated constraint and selection effects using SnIPRE92, as well as the fixation index (FI)93, for genes belonging to different-copy retention groups to demonstrate the strength of selection (Fig. 4 and Supplementary Fig. 26). We observed that the one-copy retention group exhibited a lower proportion of strongly deleterious mutations (Nes < −100) and a higher proportion of slightly deleterious mutations (−1 < Nes < 0) compared to other copy retention groups (Fig. 4a and Supplementary Fig. 26a). The proportion of adaptive divergence (α) at zero-fold nonsynonymous sites and the rate of adaptive substitution relative to neutral divergence (ωa) were both higher in the three-copy retention group (Fig. 4a and Supplementary Fig. 26a). These results suggest that the strength of both negative and positive selection increased progressively from the one-copy to the three-copy retention groups, consistent with estimates of FI (Fig. 4b and Supplementary Fig. 26b). Furthermore, the three-copy retention group demonstrated a lower constraint effect and a higher selection effect compared to other copy retentions (Fig. 4c, d and Supplementary Fig. 26c, d). Collectively, these results indicate preferential retention of three-copy genes following the polyploidization–rediploidization process, driven by strong selection and possessing potential adaptive value in response to new environments.
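The folded SFS that serves as input to this kind of analysis can be sketched as follows. Folding is used when the ancestral allele is unknown: each site is classified by its minor allele count rather than its derived allele count. The function name and the toy allele counts are illustrative assumptions and do not reproduce the authors' data preparation.

```python
def folded_sfs(alt_allele_counts, n_chromosomes):
    """Build a folded site frequency spectrum.

    `alt_allele_counts` holds, for each site, the number of sampled
    chromosomes carrying the alternate allele. The returned list is indexed
    by minor allele count, from 0 (monomorphic) to n_chromosomes // 2.
    """
    sfs = [0] * (n_chromosomes // 2 + 1)
    for count in alt_allele_counts:
        if not 0 <= count <= n_chromosomes:
            raise ValueError("allele count outside the sample size")
        minor = min(count, n_chromosomes - count)
        sfs[minor] += 1
    return sfs


# Toy example: seven sites scored in three diploid individuals (6 chromosomes).
toy_sfs = folded_sfs([1, 5, 2, 3, 6, 0, 4], n_chromosomes=6)
```

Separate spectra would be built for selected sites (for example, zero-fold degenerate sites) and putatively neutral sites within each retention group before fitting the DFE.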
WGT retained genes for root development and salt tolerance
Mangrove species live in environments characterized by high salinity and waterlogging, which pose challenges to plant growth and productivity94,95. The special root systems and high salt tolerance observed in mangroves are particularly noteworthy. Sonneratia alba, a prevalent and salt-tolerant mangrove species found in low intertidal zones, has evolved specialized structures like pneumatophores to enhance its waterlogging tolerance51. Following the WGT event, duplicated genes are often rapidly lost, while retained duplicates potentially changing expression or acquiring new functions serve as important sources of evolutionary innovation and aid in survival within the newly acquired habitat56,57,79,96. Therefore, we conducted functional analyses among the retained genes, which encompassed GO enrichment (Supplementary Fig. 27 and Supplementary Note 2) and gene function assessments based on annotations. Our focus was particularly directed toward the 584 three-copy retention groups generated by the WGT event. Several gene groups involved in auxin distribution regulation, auxin signal transduction, reactive oxygen species (ROS) scavenging, ion transport, salt overly sensitive (SOS) signaling pathway, abscisic acid (ABA) signaling pathway, and transcriptional regulation retained the three duplicates ultimately (Supplementary Data 4). Interestingly, genetic and physiological experiments have demonstrated that salt modulates root growth direction by causing asymmetric auxin distribution and impairing the gravity response. In response to high salt levels, the SOS signaling pathway mediates the rapid degradation of amyloplasts in root columella cells, leading to the loss of root gravitropism97,98. PIN and ABCB genes, coding transporters polarly localized at the plasma membrane, promote auxin efflux activity99,100, while PP2A proteins also influence PIN localization and participate in the regulation of auxin distribution101. 
Peroxidase protein-coding genes (POD) play a role in reducing ROS level, thereby preventing ROS from catalyzing auxin oxidation. These mechanisms likely together facilitate the development of erect lateral branches in horizontal roots and shape the pneumatophores of S. alba, enhancing its waterlogging tolerance (Fig. 5).
Furthermore, we integrated transcriptomes from salt-gradient treatments to elucidate the mechanism underlying salt tolerance in S. alba. Using the HISAT2-HTSeq-DESeq2 workflow, we examined expression profiles and identified differentially expressed genes (Supplementary Fig. 24). We observed 83 three-copy retention groups and 283 two-copy retention pairs with at least one copy showing up-regulation in leaf or root tissues under high salt conditions (Supplementary Table 9). We noticed that a subset of these genes, particularly the three-copy retentions, were associated with the phytohormone abscisic acid (ABA) (Fig. 5), including the ABA pathway, ABA transport, and other related processes (Supplementary Fig. 28). In detail, the release of Ca2+ in response to high salt triggers ABA biosynthesis102. Proteins such as PP2C and ABI5 function in the core ABA signaling pathway and regulate downstream stress response genes, including late embryogenesis abundant (LEA) genes103. The expression of two LEA genes, Sal009147 and Sal011573, was found to increase with salinity in leaf tissue. These hydrophilic and heat-stable proteins, with biased amino acid compositions, can sequester accumulated ions within cells and act as chaperones to prevent protein aggregation and inactivation104,105. Transcription factors have the potential to regulate multiple aspects of salt adaptation, with MYB and ERF positively influencing genes such as ABI5 and LEA, and MYB2 and MYC2 acting as transcriptional activators in ABA-inducible gene expression106. Thus, the up-regulated retained genes can enhance plant desiccation and salt tolerance, contributing to adaptation in intertidal zones.
In summary, we successfully constructed chromosome-scale genomes for two Lythraceae plants, S. alba and L. speciosa, by combining PacBio SMRT sequencing, short-read sequencing, and Hi-C technologies. Based on genomic evidence and previous studies35,58,59,77,84,85, we report an improved model of the polyploidization–rediploidization process in plants, shedding light on adaptive evolution during periods of global climate change. Our findings revealed that S. alba and L. speciosa underwent a WGT event at approximately 64 Mya, which coincided with the K-Pg boundary. Subsequently, the mangrove tree experienced extensive chromosomal rearrangements and fractionations, leading to its modern genome structure. We further discovered that the retained duplicates from the WGT event in S. alba exhibited not only sequence divergence but also significant expression divergence, which is a crucial mechanism for rediploidization. Overall, our study contributes valuable insights into plant evolution.
Methods
Plant materials
We sampled mature specimens of Sonneratia alba (Supplementary Fig. 2) from the nursery of Dongzhai Harbor National Nature Reserve in Haikou and of Lagerstroemia speciosa (Supplementary Fig. 3) from Sun Yat-sen University in Guangzhou, in both cases with proper permission. Fresh and healthy tissues were carefully collected and rapidly frozen in liquid nitrogen. Subsequently, the samples were stored at −80 °C in the laboratory until DNA or RNA extraction was performed.
Library construction and sequencing
High-molecular-weight (HMW) genomic DNA was isolated from L. speciosa leaf tissue using the CTAB (hexadecyltrimethylammonium bromide) method107 for both PacBio Single-Molecule Real-Time (SMRT) long-read sequencing and Illumina short-read sequencing. A PacBio SMRT-bell library was prepared with 10 kb long inserts following the manufacturer’s protocol and subsequently sequenced on a PacBio Sequel II platform (Pacific Biosciences, Menlo Park, CA, USA). The generated PacBio reads underwent data filtering and preprocessing, resulting in 9.57 million reads, corresponding to approximately 95.60 Gb of data and ~299X coverage (assuming a genome size of 320 Mb, Supplementary Table 1). The same batch of genomic DNA was fragmented using sonication to construct a short-insert paired-end library with 500 bp inserts. This library was sequenced on an Illumina HiSeq X Ten platform (San Diego, CA, USA), producing 41.16 Gb of data (Supplementary Table 1).
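The coverage figure quoted above follows from dividing sequencing yield by the assumed genome size. The sketch below reproduces that arithmetic for the PacBio data and applies the same calculation to the Illumina yield and to the mean PacBio read length; the function name is an illustrative assumption, and only numbers stated in this paragraph are used.

```python
def sequencing_depth(yield_gb, genome_size_mb):
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return (yield_gb * 1e9) / (genome_size_mb * 1e6)


# PacBio: 95.60 Gb over an assumed 320 Mb genome, about 299X.
pacbio_depth = sequencing_depth(95.60, 320)

# Illumina: 41.16 Gb over the same assumed genome size.
illumina_depth = sequencing_depth(41.16, 320)

# Mean PacBio read length: 95.60 Gb spread over 9.57 million reads,
# close to the 10 kb insert size of the library.
mean_read_length_bp = 95.60e9 / 9.57e6
```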
To facilitate gene prediction, total RNA was extracted from leaves of L. speciosa using the TRIzol universal reagent (Invitrogen) according to the manufacturer’s instructions. The resulting RNA-seq library was sequenced on an Illumina HiSeq X Ten platform (San Diego, CA, USA). Furthermore, total RNA for the expression atlas of S. alba was extracted from leaf, root, flower, and fruit tissues of mature plants in the Dongzhai Harbor National Nature Reserve nursery. Each tissue contains three independent biological replicates. RNA-seq libraries were prepared for sequencing on an Illumina HiSeq 2500 platform (San Diego, CA, USA), generating 150 bp paired-end reads. The RNA-seq reads yielded a total of 76.76 Gb of data (Supplementary Table 5).
For Hi-C library construction, tender leaves of both S. alba and L. speciosa were subjected to formaldehyde fixation and subsequent lysis. The cross-linked DNA was digested with MboI, and the resulting restriction fragment ends were biotinylated and ligated. The purified DNA was then physically sheared to an approximate length of 400 bp. The Hi-C library of L. speciosa was sequenced on an Illumina NovaSeq 6000 platform (San Diego, CA, USA), while the Hi-C library of S. alba was sequenced on a BGISEQ-500 platform (Shenzhen, China).
Genome assembly
We report a genome assembly of L. speciosa and improve the previous assembly of S. alba54. The genome size of S. alba was estimated to be 211.67 Mb (Supplementary Fig. 29) through k-mer-based analysis108. The genome size of L. speciosa was initially estimated using flow cytometry and k-mer-based analysis. The flow cytometry measurement indicated a size of 361 Mb, consistent with the k-mer-based estimation of 340.46 Mb (Supplementary Fig. 30). We then assembled the L. speciosa genome de novo from the PacBio long reads using wtdbg2109 with optimized parameters. To improve the accuracy of the primary assembly, it was further polished with Quiver (SMRT Analysis v2.3.0)110 using long reads. We then removed residual errors using Pilon (v1.22) based on Illumina paired-end reads111.
Using Hi-C data, we scaffolded the primary genome assemblies of S. alba and L. speciosa (Supplementary Table 10) into pseudo-chromosome-scale genomes. The Hi-C data underwent quality evaluation and assessment using HiC-Pro112. Subsequently, the Hi-C maps were generated using Juicer113, and the scaffolds were roughly separated using Juicebox114. Manual corrections were made to resolve any misassemblies based on the observed interactions. The validated assemblies were then used to construct pseudo-chromosomes with the 3D-DNA tool115. These pseudo-chromosomes provided a chromosome-scale representation of the genomes, enhancing their structural organization and contiguity.
Genome annotations
We identified repetitive sequences in each of the whole genomes using a combination of homology-based and de novo approaches. Initially, known TEs within the genome were identified using RepeatMasker with the Repbase TE library, and RepeatProteinMask searches against the TE protein database were conducted. Subsequently, a de novo repeat library for each genome was constructed using RepeatModeler, allowing for comprehensive analysis, refinement, and classification of consensus models for potential interspersed repeats116. Additionally, a de novo search for long terminal repeat (LTR) retrotransposons in each genome sequence was performed using LTR_FINDER (v1.0.7)117. Tandem repeats were identified using the Tandem Repeats Finder program, while non-interspersed repeat sequences were detected using RepeatMasker. The results from both approaches were integrated, and RepeatMasker was employed to identify the repeat sequences. We also estimated the age structures of long terminal repeat-retrotransposons (LTR-RTs). After de novo prediction of LTR-RTs, we imposed the criterion that an intact LTR-RT must be separated by 1 to 15 kb from other candidates, be flanked by a pair of putative LTRs ranging from 100 bp to 3000 bp with a similarity of over 80%, and possess a complete Gag-Pol sequence. The timing of LTR-RT insertion was estimated based on the divergence between the 5’-LTR and 3’-LTR of the same transposon, using a mutation rate of 1.3 × 10⁻⁸ substitutions per site per year63.
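The LTR-based dating described above can be sketched as follows. The text states only that insertion time was estimated from the divergence between the two LTRs and the given mutation rate; the formula T = K / (2r), which is the conventional choice because both LTRs accumulate substitutions independently after insertion, and the Jukes-Cantor correction are assumptions here, not confirmed details of the authors' pipeline. All function names are hypothetical.

```python
import math

MUTATION_RATE = 1.3e-8  # substitutions per site per year, as stated in the text


def jc69_distance(p: float) -> float:
    """Jukes-Cantor corrected distance from the proportion p of differing sites.

    Assumption: the text does not say which distance correction was used.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time(k: float, rate: float = MUTATION_RATE) -> float:
    """Insertion age in years, T = K / (2r) (assumed conventional formula)."""
    return k / (2 * rate)


def passes_ltr_filter(ltr5_len: int, ltr3_len: int, similarity: float) -> bool:
    """LTR length (100-3000 bp) and similarity (> 80%) criteria from the text."""
    lengths_ok = all(100 <= n <= 3000 for n in (ltr5_len, ltr3_len))
    return lengths_ok and similarity > 0.80


# Example: an LTR pair differing at 2% of aligned sites.
k = jc69_distance(0.02)
print(f"K = {k:.4f}, T = {insertion_time(k) / 1e6:.2f} Mya")
```

The spacing criterion (1 to 15 kb) and the Gag-Pol completeness check depend on the annotation output and are left out of this sketch.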
We conducted gene model prediction of each genome using a combination of ab initio, homology-based, and RNA-seq-assisted prediction. We used Augustus (v3.3.1)118 and GeneMark119 to perform ab initio gene prediction on the repeat-masked genome, leaving low-complexity and simple repeats unmasked because some of these repeats occur within genes. Protein sets from sequenced related plant species and model plant species were collected as homology-based evidence. Exonerate (v2.2.0) was then used to generate the gene structures based on the homology alignments120. Clean RNA-seq reads were mapped against the assembly using Tophat2 (v2.1.1), and transcripts were identified using Cufflinks (v2.2.1)121,122. Finally, EvidenceModeler was used to integrate all predictions to generate non-redundant and consensus gene models123. Gene functions were annotated based on the best alignment matches to the NCBI (NR), Swissprot, TrEMBL, InterPro, the Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Pfam non-redundant protein databases. The transcription factor identification was performed using iTAK (v1.7)124. The quality of genome assembly and annotation was assessed using Benchmarking Universal Single-Copy Orthologs (BUSCO v5) with the plant-specific dataset (eudicotyledons_odb10)125.
Phylogeny reconstruction and molecular dating
We downloaded genome and annotation data for Vitis vinifera (Genoscope.12X)73, Arabidopsis thaliana (TAIR10)126, and Eucalyptus grandis (v2.0) from the Phytozome v12.1 database83; Nelumbo nucifera from the Nelumbo Genome Database127; and Punica granatum (GCF_007655135.1) from the NCBI database128. We used OrthoFinder to identify orthologous genes among S. alba, L. speciosa, and these five eudicot species, resulting in the identification of single-copy gene groups129. For each group, we aligned the corresponding single-copy orthologous proteins and generated codon alignments using MAFFT130 and PAL2NAL131. To ensure data quality, we further applied Gblocks 0.91b132 to trim the alignments and discarded ambiguous alignments shorter than 150 bp. Using the concatenated alignment of these groups, we inferred a phylogenetic tree using RAxML-NG133 with the GTR + GAMMA + I model and performed 1,000 bootstrap replicates. Following its reconstruction, we estimated the divergence time among the seven species using MCMCTREE from the PAML (v4.9j) package with approximate likelihood calculation134,135. The HKY85 + G nucleotide substitution model and independent-rates clock model were employed in the molecular dating. To provide calibration points, we incorporated two reliable fossil calibrations. Firstly, the root node of eudicots was placed at 119.6–128.63 Mya136. Secondly, the common ancestor of Sonneratia and Lagerstroemia was set to a time earlier than 55.8 Mya, based on the earliest convincing fossils of Sonneratia-like pollen137. In order to delineate the positions of these plants within Lythraceae, we expanded our analysis by constructing a more extensive phylogenetic tree using these seven plants, 35 other genome-sequenced angiosperms, and the gymnosperm Gnetum montanum as an outgroup (Supplementary Data 5). Utilizing the embryophyta_odb10 lineage ancestral variant dataset (comprising a consensus sequence and variants of extant sequences) in BUSCO v5125, we identified 868 low-copy nuclear genes.
We then performed sequence alignment and phylogenetic inference as described earlier. The early divergence times in angiosperms were set to 125–247.2 Mya138,139. All MCMC analyses were independently run twice to ensure convergence, with 10 million generations and sampling every 500 generations after a burn-in of 1,000,000 iterations. The phylogenetic trees were visualized using the R package GGTREE140.
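As an illustration of the alignment filtering and concatenation step described above, the sketch below drops trimmed alignments shorter than 150 bp and joins the remainder into one supermatrix per species. It is a minimal stand-in under stated assumptions: the in-memory data layout (one dictionary per gene group, mapping species to aligned sequence) and the function name are hypothetical, and the actual tools used by the authors for this bookkeeping are not stated in the text.

```python
def concatenate_alignments(alignments, min_len=150):
    """Concatenate per-gene alignments after discarding short ones.

    alignments: list of dicts, each mapping species name -> aligned sequence.
    min_len:    minimum trimmed alignment length to keep (150 bp in the text).
    Returns a dict mapping species name -> concatenated sequence.
    """
    kept = []
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must share one length")
        if lengths.pop() >= min_len:
            kept.append(aln)
    if not kept:
        return {}
    species = sorted(kept[0])
    for aln in kept:
        # Single-copy groups are expected to contain every species exactly once.
        if sorted(aln) != species:
            raise ValueError("all alignments must cover the same species")
    return {sp: "".join(aln[sp] for aln in kept) for sp in species}


# Toy example with two species and three gene groups; the 9 bp group is dropped.
groups = [
    {"S_alba": "ATG" * 50, "L_speciosa": "ATA" * 50},
    {"S_alba": "ATGCCCTAA", "L_speciosa": "ATGCCTTAA"},
    {"S_alba": "GCT" * 60, "L_speciosa": "GCA" * 60},
]
supermatrix = concatenate_alignments(groups)
print({sp: len(seq) for sp, seq in supermatrix.items()})
```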
Whole-genome triplication analyses
In order to identify and locate putative WGDs in Lythraceae species, we used a multipronged approach, including intra- and inter-species synteny analysis, Ks-based estimation, and phylogenetic reconciliation. Initially, we utilized the BLASTP program to align protein sequences between species (P. granatum vs. S. alba, P. granatum vs. L. speciosa) and within species, applying the parameters (identity ≥30%, e-value < 1e−10, alignment length ≥30% of both query and reference sequences). We identified syntenic blocks containing a minimum of five shared genes using MCScanX141, and the resulting syntenic blocks between species were visualized by Circos142. Subsequently, we applied KaKs_Calculator to calculate synonymous substitution rates (Ks) with the YN substitution model143 based on alignments of all syntenic gene pairs and constructed the Ks distribution. To identify paralogous genes generated from the WGD event in the S. alba genome, we selected blocks with median Ks values in the range of 0.2–1.0, excluding gene pairs with Ks values larger than 1.26. Using the R package igraph (https://igraph.org), we further classified different-copy retention groups after the WGD event (Supplementary Fig. 12). We also identified different-copy retention groups after the WGD event in L. speciosa using the same workflow (Supplementary Fig. 13). The synteny and Ks-based analyses indicated that both S. alba and L. speciosa had undergone a whole-genome triplication (WGT) event. We also illustrated the distribution of gene densities for different-copy retention groups in both species (Supplementary Fig. 14). To investigate whether the WGT event was shared between S. alba and L. speciosa, we identified 306 P. granatum genes that possessed three orthologs generated by the WGT in both S. alba and L. speciosa. Then we performed gene tree reconstruction using RAxML-NG and classified the phylogenetic trees based on their topologies.
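The BLASTP thresholds and the grouping of retained paralogs can be sketched as below. The text says only that igraph was used to classify copy-retention groups; treating each group as a connected component of the graph of syntenic paralog pairs is an assumption, and the union-find code here is a plain-Python substitute for igraph, not the authors' script. Function names and gene identifiers are hypothetical, and the upstream block-level filter (median Ks of 0.2–1.0) is not implemented.

```python
from collections import Counter, defaultdict


def passes_blastp_filter(identity, evalue, aln_len, query_len, ref_len):
    """Thresholds from the text: identity >= 30%, e-value < 1e-10,
    alignment length >= 30% of both query and reference sequences."""
    return (identity >= 30.0 and evalue < 1e-10
            and aln_len >= 0.3 * query_len and aln_len >= 0.3 * ref_len)


def retention_groups(pairs, ks_max=1.26):
    """Group paralogs into connected components (assumed grouping rule).

    pairs: iterable of (gene_a, gene_b, ks) for syntenic paralog pairs.
    Pairs with Ks above ks_max are excluded, as in the text.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, ks in pairs:
        if ks > ks_max:
            continue
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


pairs = [("g1a", "g1b", 0.52), ("g1b", "g1c", 0.61),  # three-copy group
         ("g2a", "g2b", 0.47),                        # two-copy group
         ("g3a", "g3b", 1.50)]                        # excluded: Ks > 1.26
sizes = Counter(len(g) for g in retention_groups(pairs))
print(dict(sizes))  # copy number -> number of groups
```

Genes that appear in no retained pair never enter the graph, so under this assumed rule single-copy genes would be counted separately from the annotation.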
We also inferred and located the putative WGT placement using the multi-taxon paleopolyploidy search (MAPS) tool144. Clustering gene families among five species, including S. alba, L. speciosa, P. granatum, E. grandis, and A. thaliana, by OrthoFinder, we retained the gene families with at least one gene present in each species. We constructed gene trees based on multiple sequence alignments of each gene family as described above and rooted each tree using Notung (v2.9.1.5)145. By mapping these gene trees to the given species tree, we calculated the percentage of subtrees with gene duplications shared by all species descended from each node using the MAPS tool. To validate the WGT placement, we compared the subtree percentages at each node among observed, null simulated, and positive simulated data and recognized a significant gene duplication burst indicative of a polyploidy event. Background gene birth and death rates were estimated using the R package WGDgc146 and the mean of a geometric distribution of the root was calculated through CAFE analysis147. We performed 2,000 simulated gene tree simulations with 200 bootstrap replicates for both null and positive simulations. In the positive simulation, we designated a polyploidy event in the common ancestor of S. alba and L. speciosa, setting the wgd_retention_rate to 0.2. The observed and simulated data were compared to evaluate the location of the WGT events. The results of the synteny, Ks-based, and phylogenetic analyses consistently indicated that S. alba and L. speciosa had undergone a whole-genome triplication (WGT) event prior to their diverging from a common ancestor.
To determine the absolute timing of the whole-genome triplication (WGT) event, we conducted a molecular clock analysis on concatenated gene families, calibrated using species divergence times148. Specifically, we first identified 208 homologous gene groups among S. alba, L. speciosa, and P. granatum supporting the WGT event before the speciation event (Fig. 2e). For each group, we used the reciprocal BLASTP best-hit method between P. granatum and E. grandis to obtain the corresponding ortholog from E. grandis. Ultimately, we identified 170 gene families that exhibited a clear signal of the WGT event in the common ancestor of S. alba and L. speciosa. In each gene tree, the genes from S. alba and L. speciosa were divided into three clades. To improve the robustness and precision of estimation, we selected the outgroup clade and one of the two ingroup clades randomly to concatenate multiple sequence alignments. The phylogenetic tree was then constructed using RAxML-NG, and the molecular clock analysis was performed using the approximate likelihood calculation method in MCMCTREE under the appropriate model134,135. The nodes were constrained using species divergence times obtained from the phylogenetic tree as described above. Each analysis was independently run twice to ensure convergence.
Chromosome evolution
In order to infer their evolutionary history, we selected representative species in the Lythraceae with chromosome-scale genome assemblies. We first inferred the ancestral chromosome numbers across the phylogenetic tree (Supplementary Fig. 18) using ChromEvol (v2.1)149. The haploid chromosome number (n) of Sonneratia alba was reported by S. Graham150, while the n of other Lythraceae species (Lagerstroemia speciosa, Punica granatum, Pemphis acidula) and outgroup (Eucalyptus grandis) were obtained from the Chromosome Counts Database (CCDB)151. The chromosome number of the most recent common ancestor (MRCA) among S. alba, L. speciosa, and P. granatum was the same as P. granatum (n = 8). We utilized WGDI (v0.6.5) to identify adjacent conserved collinear genes and blocks among all chromosome pairs within the three Lythraceae species, and then reconstructed the Ancestral Lythraceae Karyotype (ALK), excluding interference from fragmented collinear regions, following the tutorial152,153. Subsequently, we visualized the global pattern of chromosomal changes in extant species. Furthermore, we depicted the evolutionary history of S. alba chromosomes to provide a clearer representation of the karyotype evolution152. While the reconstructed ancestral karyotype almost certainly has a very similar structure to the true ancestral genome, it may not be absolutely identical60. We also conducted synteny analysis among the modern genomes of the three Lythraceae species using MCScanX and JCVI to discover chromosome rearrangements154.
Transcriptome sequencing and analysis
RNA-seq reads from the leaf, root, flower, and fruit tissues of S. alba were first filtered using SolexaQA++ (v3.1.7.1)155. Clean reads were aligned to the S. alba genome using HISAT2 (v2.2.0)156. HTSeq (v0.13.5) was used to determine the number of reads uniquely mapped to each gene in the tissue samples157. To detect the differential expression of duplicated genes, we employed the exact conditional test158, which has been successfully applied in soybean, Brassica, and Avicennia159–161. For each pair of duplicated genes, we computed the P-value using the R function binom.test. Multiple testing was corrected by applying the Bonferroni correction method. Differential expression was considered significant for gene pairs with a corrected P-value below 0.05. Only gene pairs in which at least one gene had a read count greater than 0 were included in the analysis. We also applied the method to three-copy duplicated gene groups through pairwise comparisons. We finally identified duplicated gene pairs with differential expression for each tissue based on the consistency in three replicates. To investigate the functional roles of these differentially expressed gene pairs, we performed GO enrichment analysis using BiNGO in Cytoscape (v3.7.2)162. Additionally, we performed RNA-seq on leaf, stem, flower, and fruit tissues of L. speciosa (Supplementary Table 7 and Supplementary Fig. 31). We employed the HISAT2–HTSeq–exact conditional test workflow, as described earlier, to identify differentially expressed duplicated gene pairs. Subsequently, we conducted GO enrichment analysis on these gene pairs in L. speciosa (Supplementary Fig. 32).
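The per-pair test described above can be illustrated with a small pure-Python version of a two-sided exact binomial test followed by Bonferroni correction. This is a sketch, not the authors' R code: the null proportion of 0.5 (equal expression of the two copies, with no adjustment for gene length) is an assumption, and the function names and example counts are hypothetical. The two-sided P-value sums the probabilities of all outcomes no more likely than the observed one, which is how R's `binom.test` defines it.

```python
import math


def binom_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Two-sided exact binomial P-value for k successes in n trials."""
    if n == 0:
        return 1.0
    log_p, log_q = math.log(p), math.log(1 - p)

    def log_pmf(i):
        # Log-space evaluation avoids overflow for large read counts.
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * log_p + (n - i) * log_q)

    pmf = [math.exp(log_pmf(i)) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def differential_pairs(counts, alpha=0.05, p=0.5):
    """counts: dict pair_id -> (reads_copy1, reads_copy2).

    Pairs with zero reads in both copies are skipped, as in the text.
    Returns dict pair_id -> True if the Bonferroni-corrected P-value < alpha.
    """
    tested = {pid: ab for pid, ab in counts.items() if sum(ab) > 0}
    m = len(tested)
    result = {}
    for pid, (a, b) in tested.items():
        corrected = min(1.0, binom_two_sided(a, a + b, p) * m)
        result[pid] = corrected < alpha
    return result


counts = {"pair1": (0, 40), "pair2": (20, 22), "pair3": (0, 0)}
print(differential_pairs(counts))
```

In this toy input, `pair1` is strongly biased toward one copy, `pair2` is balanced, and `pair3` is excluded before testing. Where SciPy is available, `scipy.stats.binomtest` provides an equivalent exact test.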
We also collected the transcriptomes of leaf and root tissues of S. alba under different salinity conditions163. Specifically, S. alba plants were divided into three groups and subjected to irrigation with solutions containing 0, 250, and 500 mM NaCl to simulate low, medium, and high salinity conditions, respectively. After read mapping and counting, we identified differentially expressed genes (DEGs) from two comparisons (medium vs. low salinity condition, high vs. medium salinity condition) using the R package DESeq2164. A P-value below 0.05 and a fold change greater than two were set as the threshold for significant differential expression.
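The DEG threshold stated above can be written as a simple predicate. The sketch assumes that "fold change greater than two" applies in either direction (more than twofold up or down), which the text does not specify, and it does not reproduce any DESeq2 internals; the function name and argument layout are hypothetical.

```python
import math


def is_deg(p_value: float, fold_change: float,
           alpha: float = 0.05, min_fold: float = 2.0) -> bool:
    """Return True if a gene passes the P-value and fold-change thresholds.

    fold_change is the linear-scale expression ratio between two conditions
    (for example medium vs. low salinity) and must be positive.
    Assumption: the fold-change cutoff is applied symmetrically.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return p_value < alpha and abs(math.log2(fold_change)) > math.log2(min_fold)


# Hypothetical (P-value, fold change) results for four genes.
results = {"geneA": (0.001, 4.0), "geneB": (0.001, 0.2),
           "geneC": (0.200, 4.0), "geneD": (0.001, 1.5)}
degs = [gene for gene, (pv, fc) in results.items() if is_deg(pv, fc)]
print(degs)
```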
Distribution of fitness effects of variants
To quantify the impact of selection on different-copy retention groups after the WGT event, we estimated the distribution of fitness effects (DFE) of new mutations and the proportion of adaptive divergence (α) at zero-fold nonsynonymous sites using DFE-alpha88–90. The whole-genome resequencing data from two populations (Cebu, Philippines; Davao, Philippines) of Sonneratia alba (12 individuals per population), and one individual of congeneric species S. apetala were used in the analysis. We first binned protein-coding genes into three subsets according to retained copy numbers (one-, two-, three-copy). We used DFE-alpha to compare the folded site frequency spectrum (SFS) and divergence of zero-fold nonsynonymous sites with those for four-fold synonymous sites. The four-fold synonymous sites were assumed to be neutral. We also estimated DFE and α with 200 bootstrap replicates generated by randomly sampling genes of each subset.
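Two of the data-preparation steps described above, folding the site frequency spectrum and resampling genes for bootstrap replicates, can be sketched as follows. This does not reproduce DFE-alpha or its input format; the function names are hypothetical, and sampling genes with replacement is an assumption, since the text says only that genes of each subset were randomly sampled. With 12 diploid individuals per population, each site is scored across 24 chromosomes.

```python
import random


def folded_sfs(allele_counts, n_chrom):
    """Folded SFS: sfs[i] = number of sites whose minor allele is seen i times.

    allele_counts: count of one (arbitrarily chosen) allele at each site
                   among n_chrom sampled chromosomes.
    """
    sfs = [0] * (n_chrom // 2 + 1)
    for c in allele_counts:
        if not 0 <= c <= n_chrom:
            raise ValueError("allele count out of range")
        sfs[min(c, n_chrom - c)] += 1
    return sfs


def bootstrap_genes(genes, n_replicates=200, seed=0):
    """Resample genes with replacement (assumed scheme), 200 replicates by default."""
    rng = random.Random(seed)
    return [rng.choices(genes, k=len(genes)) for _ in range(n_replicates)]


# Toy example: seven sites scored in 12 diploid individuals (24 chromosomes).
counts = [0, 1, 23, 24, 12, 5, 19]
print(folded_sfs(counts, n_chrom=24))

replicates = bootstrap_genes(["gene1", "gene2", "gene3"])
print(len(replicates), replicates[0])
```

In the workflow described in the text, one folded SFS would be built for zero-fold nonsynonymous sites and one for four-fold synonymous sites within each copy-number subset, and the resampled gene sets would feed repeated DFE and α estimates.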
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Acknowledgements
The authors thank Zuyao Liu and Nan Wang for their technical support. The project was supported by the National Natural Science Foundation of China (32170230 to Z.H., 31971540 to Z.H. and 32330005 to S. Shi), the Guangdong Basic and Applied Basic Research Foundation (2023B1515020083 to Z.H.), the Innovation Group Project of Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai) (311021006 to S. Shi), the Shenzhen Science and Technology Innovation Program (RCBS20221008093316043 to Q.C.), and the China Postdoctoral Science Foundation (2023M740690 to X.F.).
Author contributions
Z.H. and S. Shi conceived the study. X.F., S. Shi, and Z.H. designed and conceptualized the study. X.F., Q.C., W.W., J.W., G.L., and Z.H. performed the data analysis. X.F., Q.C., G.L., S.X., C.Z., S. Shi, and Z.H. collected materials. X.F., Q.C., S.X., S. Shao, and M.L. performed the experiments. X.F., Q.C., S. Shi, and Z.H. wrote the manuscript. C.-I.W. revised the manuscript. All authors read and approved the final manuscript.
Peer review
Peer review information
Nature Communications thanks Huan Liu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Data availability
The raw genomic Illumina reads, PacBio reads, Hi-C reads, and RNA-seq reads reported in this paper have been deposited in the Genome Sequence Archive (GSA, https://ngdc.cncb.ac.cn/gsa) in National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation, under accession number CRA004284 with BioProject ID PRJCA005319. The genome assembly sequences have been deposited in the Genome Warehouse (GWH, https://ngdc.cncb.ac.cn/gwh) in National Genomics Data Center under accession number GWHBCIQ00000000 [https://ngdc.cncb.ac.cn/gwh/Assembly/20653/show], GWHBCKL00000000 [https://ngdc.cncb.ac.cn/gwh/Assembly/20692/show] with BioProject ID PRJCA004930 and BioSample ID SAMC353197, SAMC353201. The genome assemblies and annotations are also available at Figshare: Sonneratia alba [10.6084/m9.figshare.25118819], Lagerstroemia speciosa [10.6084/m9.figshare.25118831]. Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
These authors contributed equally: Xiao Feng, Qipian Chen.
Contributor Information
Suhua Shi, Email: lssssh@mail.sysu.edu.cn.
Ziwen He, Email: heziwen@mail.sysu.edu.cn.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-024-46080-7.
References
- 1. Darwin, F. & Seward, A. C. More Letters of Charles Darwin (John Murray, 1903).
- 2. Friedman WE. The meaning of Darwin’s ‘abominable mystery’. Am. J. Bot. 2009;96:5–21. doi: 10.3732/ajb.0800150.
- 3. Wood TE, et al. The frequency of polyploid speciation in vascular plants. Proc. Natl Acad. Sci. 2009;106:13875–13879. doi: 10.1073/pnas.0811575106.
- 4. Van de Peer Y, Mizrachi E, Marchal K. The evolutionary significance of polyploidy. Nat. Rev. Genet. 2017;18:411–424. doi: 10.1038/nrg.2017.26.
- 5. Ren R, et al. Widespread whole genome duplications contribute to genome complexity and species diversity in angiosperms. Mol. Plant. 2018;11:414–428. doi: 10.1016/j.molp.2018.01.002.
- 6. Clark JW, Donoghue PCJ. Whole-genome duplication and plant macroevolution. Trends Plant Sci. 2018;23:933–945. doi: 10.1016/j.tplants.2018.07.006.
- 7. Rice A, et al. The global biogeography of polyploid plants. Nat. Ecol. Evol. 2019;3:265–273. doi: 10.1038/s41559-018-0787-9.
- 8. Qiao X, et al. Gene duplication and evolution in recurring polyploidization–diploidization cycles in plants. Genome Biol. 2019;20:38. doi: 10.1186/s13059-019-1650-2.
- 9. Wu S, Han B, Jiao Y. Genetic contribution of paleopolyploidy to adaptive evolution in angiosperms. Mol. Plant. 2020;13:59–71. doi: 10.1016/j.molp.2019.10.012.
- 10. Jiao Y, et al. Ancestral polyploidy in seed plants and angiosperms. Nature. 2011;473:97–100. doi: 10.1038/nature09916.
- 11. Jiao Y, et al. A genome triplication associated with early diversification of the core eudicots. Genome Biol. 2012;13:R3. doi: 10.1186/gb-2012-13-1-r3.
- 12. Bowers JE, Chapman BA, Rong J, Paterson AH. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003;422:433–438. doi: 10.1038/nature01521.
- 13. Monnahan P, et al. Pervasive population genomic consequences of genome duplication in Arabidopsis arenosa. Nat. Ecol. Evol. 2019;3:457–468. doi: 10.1038/s41559-019-0807-4.
- 14. Schmutz J, et al. Genome sequence of the palaeopolyploid soybean. Nature. 2010;463:178–183. doi: 10.1038/nature08670.
- 15. Iorizzo M, et al. A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution. Nat. Genet. 2016;48:657–666. doi: 10.1038/ng.3565.
- 16. Ibarra-Laclette E, et al. Architecture and evolution of a minute plant genome. Nature. 2013;498:94–98. doi: 10.1038/nature12132.
- 17. Bock DG, Kane NC, Ebert DP, Rieseberg LH. Genome skimming reveals the origin of the Jerusalem Artichoke tuber crop species: neither from Jerusalem nor an artichoke. N. Phytologist. 2014;201:1021–1030. doi: 10.1111/nph.12560.
- 18. Mandáková T, Pouch M, Brock JR, Al-Shehbaz IA, Lysak MA. Origin and evolution of diploid and allopolyploid Camelina genomes was accompanied by chromosome shattering. Plant Cell. 2019;31:2596–2612. doi: 10.1105/tpc.19.00366.
- 19. Aköz G, Nordborg M. The Aquilegia genome reveals a hybrid origin of core eudicots. Genome Biol. 2019;20:256. doi: 10.1186/s13059-019-1888-8.
- 20. Moghe GD, et al. Consequences of whole-genome triplication as revealed by comparative genomic analyses of the wild radish Raphanus raphanistrum and three other Brassicaceae species. Plant Cell. 2014;26:1925–1937. doi: 10.1105/tpc.114.124297.
- 21. Kagale S, et al. The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure. Nat. Commun. 2014;5:3706. doi: 10.1038/ncomms4706.
- 22. Cheng S, et al. The Tarenaya hassleriana genome provides insight into reproductive trait and genome evolution of Crucifers. Plant Cell. 2013;25:2813–2830. doi: 10.1105/tpc.113.113480.
- 23. Wang J, et al. Recursive paleohexaploidization shaped the Durian genome. Plant Physiol. 2019;179:209–219. doi: 10.1104/pp.18.00921.
- 24. Wang Z, et al. The genome of Hibiscus hamabo reveals its adaptation to saline and waterlogged habitat. Hortic. Res. 2022;9:uhac067. doi: 10.1093/hr/uhac067.
- 25. Tu L, et al. Genome of Tripterygium wilfordii and identification of cytochrome P450 involved in triptolide biosynthesis. Nat. Commun. 2020;11:971. doi: 10.1038/s41467-020-14776-1.
- 26. Hane JK, et al. A comprehensive draft genome sequence for lupin (Lupinus angustifolius), an emerging health food: insights into plant-microbe interactions and legume evolution. Plant Biotechnol. J. 2017;15:318–330. doi: 10.1111/pbi.12615.
- 27. Lee ES, et al. Engineering homoeologs provide a fine scale for quantitative traits in polyploid. Plant Biotechnol. J. 2023;21:2458–2472. doi: 10.1111/pbi.14141.
- 28. Bombarely A, et al. Insight into the evolution of the Solanaceae from the parental genomes of Petunia hybrida. Nat. Plants. 2016;2:16074. doi: 10.1038/nplants.2016.74.
- 29. Sun G, et al. Large-scale gene losses underlie the genome evolution of parasitic plant Cuscuta australis. Nat. Commun. 2018;9:2683. doi: 10.1038/s41467-018-04721-8.
- 30. Liang Y, et al. The genome of Eustoma grandiflorum reveals the whole‐genome triplication event contributing to ornamental traits in cultivated lisianthus. Plant Biotechnol. J. 2022;20:1856–1858. doi: 10.1111/pbi.13899.
- 31. Liu B, et al. Mikania micrantha genome provides insights into the molecular mechanism of rapid growth. Nat. Commun. 2020;11:340. doi: 10.1038/s41467-019-13926-4.
- 32. Palfalvi G, et al. Genomes of the venus flytrap and close relatives unveil the roots of plant carnivory. Curr. Biol. 2020;30:2312–2320.e5. doi: 10.1016/j.cub.2020.04.051.
- 33. Song A, et al. Analyses of a chromosome-scale genome assembly reveal the origin and evolution of cultivated chrysanthemum. Nat. Commun. 2023;14:2021. doi: 10.1038/s41467-023-37730-3.
- 34. Nakano M, et al. A chromosome-level genome sequence of Chrysanthemum seticuspe, a model species for hexaploid cultivated chrysanthemum. Commun. Biol. 2021;4:1167. doi: 10.1038/s42003-021-02704-y.
- 35. Baduel P, Bray S, Vallejo-Marin M, Kolář F, Yant L. The “Polyploid Hop”: shifting challenges and opportunities over the evolutionary lifespan of genome duplications. Front Ecol. Evol. 2018;6:117. doi: 10.3389/fevo.2018.00117.
- 36. Comai L. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 2005;6:836–846. doi: 10.1038/nrg1711.
- 37. Meeus S, Šemberová K, De Storme N, Geelen D, Vallejo-Marín M. Effect of whole-genome duplication on the evolutionary rescue of sterile hybrid monkeyflowers. Plant Commun. 2020;1:100093. doi: 10.1016/j.xplc.2020.100093.
- 38. Selmecki AM, et al. Polyploidy can drive rapid adaptation in yeast. Nature. 2015;519:349–352. doi: 10.1038/nature14187.
- 39. Ebadi M, et al. The duplication of genomes and genetic networks and its potential for evolutionary adaptation and survival during environmental turmoil. Proc. Natl Acad. Sci. 2023;120:e2307289120. doi: 10.1073/pnas.2307289120.
- 40. Rudman SM, et al. What genomic data can reveal about eco-evolutionary dynamics. Nat. Ecol. Evol. 2018;2:9–15. doi: 10.1038/s41559-017-0385-2.
- 41. Aguirre-Liguori JA, Ramírez-Barahona S, Gaut BS. The evolutionary genomics of species’ responses to climate change. Nat. Ecol. Evol. 2021;5:1350–1360. doi: 10.1038/s41559-021-01526-9.
- 42. Borevitz J. Utilizing genomics to understand and respond to global climate change. Genome Biol. 2021;22:91. doi: 10.1186/s13059-021-02317-y.
- 43. Vanneste K, Baele G, Maere S, Van de Peer Y. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary. Genome Res. 2014;24:1334–1347. doi: 10.1101/gr.168997.113.
- 44. Soltis PS, Soltis DE. Ancient WGD events as drivers of key innovations in angiosperms. Curr. Opin. Plant Biol. 2016;30:159–165. doi: 10.1016/j.pbi.2016.03.015.
- 45. Cai L, et al. Widespread ancient whole‐genome duplications in Malpighiales coincide with Eocene global climatic upheaval. N. Phytologist. 2019;221:565–576. doi: 10.1111/nph.15357.
- 46. Benton MJ, Wilf P, Sauquet H. The Angiosperm Terrestrial Revolution and the origins of modern biodiversity. N. Phytologist. 2022;233:2017–2035. doi: 10.1111/nph.17822.
- 47. Heimhofer U, Hochuli PA, Burla S, Dinis JML, Weissert H. Timing of Early Cretaceous angiosperm diversification and possible links to major paleoenvironmental change. Geology. 2005;33:141. doi: 10.1130/G21053.1.
- 48. Schulte P, et al. The Chicxulub asteroid impact and mass extinction at the Cretaceous-Paleogene boundary. Science. 2010;327:1214–1218. doi: 10.1126/science.1177265.
- 49. Zhang L, et al. The ancient wave of polyploidization events in flowering plants and their facilitated adaptation to environmental stress. Plant Cell Environ. 2020;43:2847–2856. doi: 10.1111/pce.13898.
- 50. Duke NC, et al. A world without mangroves? Science. 2007;317:41–42. doi: 10.1126/science.317.5834.41b.
- 51. Tomlinson, P. B. The Botany Of Mangroves 2nd edn. (Cambridge Univ. Press, 2016).
- 52. He Z, et al. Speciation with gene flow via cycles of isolation and migration: insights from multiple mangrove taxa. Natl Sci. Rev. 2019;6:275–288. doi: 10.1093/nsr/nwy078.
- 53. He Z, et al. De novo assembly of coding sequences of the mangrove palm (Nypa fruticans) using RNA-seq and discovery of whole-genome duplications in the ancestor of palms. PLoS ONE. 2015;10:e0145385. doi: 10.1371/journal.pone.0145385.
- 54. He Z, et al. Convergent adaptation of the genomes of woody plants at the land–sea interface. Natl Sci. Rev. 2020;7:978–993. doi: 10.1093/nsr/nwaa027.
- 55. Xu S, et al. The origin, diversification and adaptation of a major mangrove clade (Rhizophoreae) revealed by whole-genome sequencing. Natl Sci. Rev. 2017;4:721–734. doi: 10.1093/nsr/nwx065.
- 56. Xu S, et al. Where whole‐genome duplication is most beneficial: adaptation of mangroves to a wide salinity range between land and sea. Mol. Ecol. 2023;32:460–475. doi: 10.1111/mec.16320.
- 57. Feng X, et al. Genomic insights into molecular adaptation to intertidal environments in the mangrove Aegiceras corniculatum. N. Phytologist. 2021;231:2346–2358. doi: 10.1111/nph.17551.
- 58. Mandáková T, Lysak MA. Post-polyploid diploidization and diversification through dysploid changes. Curr. Opin. Plant Biol. 2018;42:55–65. doi: 10.1016/j.pbi.2018.03.001.
- 59. Wendel JF. The wondrous cycles of polyploidy in plants. Am. J. Bot. 2015;102:1753–1756. doi: 10.3732/ajb.1500320.
- 60. Murat F, Armero A, Pont C, Klopp C, Salse J. Reconstructing the genome of the most recent common ancestor of flowering plants. Nat. Genet. 2017;49:490–496. doi: 10.1038/ng.3813.
- 61. Pont C, et al. Paleogenomics: reconstruction of plant evolutionary trajectories from modern and ancient DNA. Genome Biol. 2019;20:29. doi: 10.1186/s13059-019-1627-1.
- 62. He Z, et al. Evolution of coastal forests based on a full set of mangrove genomes. Nat. Ecol. Evol. 2022;6:738–749. doi: 10.1038/s41559-022-01744-9.
- 63. Lyu H, He Z, Wu C-I, Shi S. Convergent adaptive evolution in marginal environments: unloading transposable elements as a common strategy among mangrove genomes. N. Phytologist. 2018;217:428–438. doi: 10.1111/nph.14784.
- 64.Petrov DA. Evolution of genome size: new approaches to an old problem. Trends Genet. 2001;17:23–28. doi: 10.1016/S0168-9525(00)02157-0. [DOI] [PubMed] [Google Scholar]
- 65.Bennetzen JL, Wang H. The contributions of transposable elements to the structure, function, and evolution of plant genomes. Annu. Rev. Plant Biol. 2014;65:505–530. doi: 10.1146/annurev-arplant-050213-035811. [DOI] [PubMed] [Google Scholar]
- 66.Ou S, Jiang N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons. Plant Physiol. 2018;176:1410–1422. doi: 10.1104/pp.17.01310. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 67.Kumar S, et al. TimeTree 5: an expanded resource for species divergence times. Mol. Biol. Evol. 2022;39:msac174. doi: 10.1093/molbev/msac174. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 68.Shi T, Chen J. A reappraisal of the phylogenetic placement of the Aquilegia whole-genome duplication. Genome Biol. 2020;21:295. doi: 10.1186/s13059-020-02212-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 69.Roelofs D, et al. Multi-faceted analysis provides little evidence for recurrent whole-genome duplications during hexapod evolution. BMC Biol. 2020;18:57. doi: 10.1186/s12915-020-00789-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 70.Wang Y, et al. An ancient whole-genome duplication event and its contribution to flavor compounds in the tea plant (Camellia sinensis) Hortic. Res. 2021;8:176. doi: 10.1038/s41438-021-00613-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 71.De Smet R, Van de Peer Y. Redundancy and rewiring of genetic networks following genome-wide duplication events. Curr. Opin. Plant Biol. 2012;15:168–176. doi: 10.1016/j.pbi.2012.01.003. [DOI] [PubMed] [Google Scholar]
- 72.Zachos JC, Dickens GR, Zeebe RE. An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics. Nature. 2008;451:279–283. doi: 10.1038/nature06588. [DOI] [PubMed] [Google Scholar]
- 73.The French–Italian Public Consortium for Grapevine Genome Characterization. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449:463–467. doi: 10.1038/nature06148. [DOI] [PubMed] [Google Scholar]
- 74.Armarego-Marriott T. Doubled genome an asset. Nat. Clim. Chang. 2020;10:184. doi: 10.1038/s41558-020-0726-z. [DOI] [Google Scholar]
- 75.Stevens AV, Nicotra AB, Godfree RC, Guja LK. Polyploidy affects the seed, dormancy and seedling characteristics of a perennial grass, conferring an advantage in stressful climates. Plant Biol. 2020;22:500–513. doi: 10.1111/plb.13094. [DOI] [PubMed] [Google Scholar]
- 76.Bowers JE, Paterson AH. Chromosome number is key to longevity of polyploid lineages. N. Phytologist. 2021;231:19–28. doi: 10.1111/nph.17361. [DOI] [PubMed] [Google Scholar]
- 77.Doyle JJ, Coate JE. Polyploidy, the nucleotype, and novelty: the impact of genome doubling on the biology of the cell. Int J. Plant Sci. 2019;180:1–52. doi: 10.1086/700636. [DOI] [Google Scholar]
- 78.Yant L, Bomblies K. Genome management and mismanagement—cell-level opportunities and challenges of whole-genome duplication. Genes Dev. 2015;29:2405–2419. doi: 10.1101/gad.271072.115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 79.Van de Peer Y, Ashman T-L, Soltis PS, Soltis DE. Polyploidy: an evolutionary and ecological force in stressful times. Plant Cell. 2021;33:11–26. doi: 10.1093/plcell/koaa015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 80.Yu H, et al. A route to de novo domestication of wild allotetraploid rice. Cell. 2021;184:1156–1170.e14. doi: 10.1016/j.cell.2021.01.013. [DOI] [PubMed] [Google Scholar]
- 81.Levin DA. Minority cytotype exclusion in local plant populations. Taxon. 1975;24:35–43. doi: 10.2307/1218997. [DOI] [Google Scholar]
- 82.Burton TL, Husband BC. Fitness differences among diploids, tetraploids, and their triploid progeny in Chamerion angustifolium: mechanisms of inviability and implications for polyploid evolution. Evolution. 2000;54:1182–1191. doi: 10.1111/j.0014-3820.2000.tb00553.x. [DOI] [PubMed] [Google Scholar]
- 83.Myburg AA, et al. The genome of Eucalyptus grandis. Nature. 2014;510:356–362. doi: 10.1038/nature13308. [DOI] [PubMed] [Google Scholar]
- 84.Van de Peer Y, Maere S, Meyer A. The evolutionary significance of ancient genome duplications. Nat. Rev. Genet. 2009;10:725–732. doi: 10.1038/nrg2600. [DOI] [PubMed] [Google Scholar]
- 85.Li Z, et al. Patterns and processes of diploidization in land plants. Annu. Rev. Plant Biol. 2021;72:387–410. doi: 10.1146/annurev-arplant-050718-100344. [DOI] [PubMed] [Google Scholar]
- 86.Sessa EB. Polyploidy as a mechanism for surviving global change. N. Phytologist. 2019;221:5–6. doi: 10.1111/nph.15513. [DOI] [PubMed] [Google Scholar]
- 87.Carretero‐Paulet L, Van de Peer Y. The evolutionary conundrum of whole‐genome duplication. Am. J. Bot. 2020;107:1101–1105. doi: 10.1002/ajb2.1520. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 88.Keightley PD, Eyre-Walker A. Joint inference of the distribution of fitness effects of deleterious mutations and population demography based on nucleotide polymorphism frequencies. Genetics. 2007;177:2251–2261. doi: 10.1534/genetics.107.080663. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 89.Eyre-Walker A, Keightley PD. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol. Biol. Evol. 2009;26:2097–2108. doi: 10.1093/molbev/msp119. [DOI] [PubMed] [Google Scholar]
- 90.Schneider A, Charlesworth B, Eyre-Walker A, Keightley PD. A method for inferring the rate of occurrence and fitness effects of advantageous mutations. Genetics. 2011;189:1427–1437. doi: 10.1534/genetics.111.131730. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 91.Chen Q, et al. Two decades of suspect evidence for adaptive molecular evolution—negative selection confounding positive-selection signals. Natl Sci. Rev. 2022;9:nwab217. doi: 10.1093/nsr/nwab217. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Eilertson KE, Booth JG, Bustamante CD. SnIPRE: selection inference using a Poisson random effects model. PLoS Comput. Biol. 2012;8:e1002806. doi: 10.1371/journal.pcbi.1002806. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 93.Shapiro JA, et al. Adaptive genic evolution in the Drosophila genomes. Proc. Natl Acad. Sci. 2007;104:2271–2276. doi: 10.1073/pnas.0610385104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 94.Ball MC. Mangrove species richness in relation to salinity and waterlogging: a case study along the Adelaide River floodplain, northern Australia. Glob. Ecol. Biogeogr. Lett. 1998;7:73. doi: 10.2307/2997699. [DOI] [Google Scholar]
- 95.Singh A. Soil salinization and waterlogging: a threat to environment and agricultural sustainability. Ecol. Indic. 2015;57:128–130. doi: 10.1016/j.ecolind.2015.04.027. [DOI] [Google Scholar]
- 96.Freeling M. Bias in plant gene content following different sorts of duplication: tandem, whole-genome, segmental, or by transposition. Annu. Rev. Plant Biol. 2009;60:433–453. doi: 10.1146/annurev.arplant.043008.092122. [DOI] [PubMed] [Google Scholar]
- 97.Sun F, et al. Salt modulates gravity signaling pathway to regulate growth direction of primary roots in Arabidopsis. Plant Physiol. 2008;146:178–188. doi: 10.1104/pp.107.109413. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Zhang Y, Xiao G, Wang X, Zhang X, Friml J. Evolution of fast root gravitropism in seed plants. Nat. Commun. 2019;10:3480. doi: 10.1038/s41467-019-11471-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 99.Friml J, Wiśniewska J, Benková E, Mendgen K, Palme K. Lateral relocation of auxin efflux regulator PIN3 mediates tropism in. Arabidopsis. Nat. 2002;415:806–809. doi: 10.1038/415806a. [DOI] [PubMed] [Google Scholar]
- 100.Han EH, Petrella DP, Blakeslee JJ. ‘Bending’ models of halotropism: incorporating protein phosphatase 2A, ABCB transporters, and auxin metabolism. J. Exp. Bot. 2017;68:3071–3089. doi: 10.1093/jxb/erx127. [DOI] [PubMed] [Google Scholar]
- 101.Gao H-B, Chu Y-J, Xue H-W. Phosphatidic acid (PA) binds PP2AA1 to regulate PP2A activity and PIN1 polar localization. Mol. Plant. 2013;6:1692–1702. doi: 10.1093/mp/sst076. [DOI] [PubMed] [Google Scholar]
- 102.Julkowska MM, Testerink C. Tuning plant signaling and growth to survive salt. Trends Plant Sci. 2015;20:586–594. doi: 10.1016/j.tplants.2015.06.008. [DOI] [PubMed] [Google Scholar]
- 103.Skubacz A, Daszkowska-Golec A, Szarejko I. The role and regulation of ABI5 (ABA-Insensitive 5) in plant development, abiotic stress responses and phytohormone crosstalk. Front. Plant Sci. 2016;7:1884. doi: 10.3389/fpls.2016.01884. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 104.Hundertmark M, Hincha DK. LEA (Late Embryogenesis Abundant) proteins and their encoding genes in Arabidopsis thaliana. BMC Genomics. 2008;9:118. doi: 10.1186/1471-2164-9-118. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 105.Battaglia M, Olvera-Carrillo Y, Garciarrubio A, Campos F, Covarrubias AA. The enigmatic LEA proteins and other hydrophilins. Plant Physiol. 2008;148:6–24. doi: 10.1104/pp.108.120725. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 106.Abe H, et al. Arabidopsis AtMYC2 (bHLH) and AtMYB2 (MYB) function as transcriptional activators in abscisic acid signaling. Plant Cell. 2003;15:63–78. doi: 10.1105/tpc.006130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19:11–15. [Google Scholar]
- 108.Vurture GW, et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 2017;33:2202–2204. doi: 10.1093/bioinformatics/btx153. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nat. Methods. 2020;17:155–158. doi: 10.1038/s41592-019-0669-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.Chin C-S, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods. 2013;10:563–569. doi: 10.1038/nmeth.2474. [DOI] [PubMed] [Google Scholar]
- 111.Walker BJ, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963. doi: 10.1371/journal.pone.0112963. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Servant N, et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 2015;16:259. doi: 10.1186/s13059-015-0831-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Durand NC, et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 2016;3:95–98. doi: 10.1016/j.cels.2016.07.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Durand NC, et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 2016;3:99–101. doi: 10.1016/j.cels.2015.07.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Dudchenko O, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 2017;356:92–95. doi: 10.1126/science.aal3327. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Tarailo‐Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinform. 2009;25:1–14. doi: 10.1002/0471250953.bi0410s25. [DOI] [PubMed] [Google Scholar]
- 117.Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 2007;35:W265–W268. doi: 10.1093/nar/gkm286. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 118.Stanke M, et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34:W435–W439. doi: 10.1093/nar/gkl200. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Lomsadze A. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res. 2005;33:6494–6506. doi: 10.1093/nar/gki937. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinform. 2005;6:31. doi: 10.1186/1471-2105-6-31. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 121.Trapnell C, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012;7:562–578. doi: 10.1038/nprot.2012.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Kim D, et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36. doi: 10.1186/gb-2013-14-4-r36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Haas BJ, et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 2008;9:R7. doi: 10.1186/gb-2008-9-1-r7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Zheng Y, et al. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol. Plant. 2016;9:1667–1670. doi: 10.1016/j.molp.2016.09.014. [DOI] [PubMed] [Google Scholar]
- 125.Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 2021;38:4647–4654. doi: 10.1093/molbev/msab199. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Lamesch P, et al. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40:D1202–D1210. doi: 10.1093/nar/gkr1090. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Shi T, et al. Distinct expression and methylation patterns for genes with different fates following a single whole-genome duplication in flowering plants. Mol. Biol. Evol. 2020;37:2394–2413. doi: 10.1093/molbev/msaa105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 128.Luo X, et al. The pomegranate (Punica granatum L.) draft genome dissects genetic divergence between soft‐ and hard‐seeded cultivars. Plant Biotechnol. J. 2020;18:955–968. doi: 10.1111/pbi.13260. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 129.Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238. doi: 10.1186/s13059-019-1832-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34:W609–W612. doi: 10.1093/nar/gkl315. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000;17:540–552. doi: 10.1093/oxfordjournals.molbev.a026334. [DOI] [PubMed] [Google Scholar]
- 133.Kozlov AM, Darriba D, Flouri T, Morel B, Stamatakis A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics. 2019;35:4453–4455. doi: 10.1093/bioinformatics/btz305. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088. [DOI] [PubMed] [Google Scholar]
- 135.Dos Reis M, Yang Z. Approximate likelihood calculation on a phylogeny for Bayesian estimation of divergence times. Mol. Biol. Evol. 2011;28:2161–2172. doi: 10.1093/molbev/msr045. [DOI] [PubMed] [Google Scholar]
- 136.Morris JL, et al. The timescale of early land plant evolution. Proc. Natl Acad. Sci. 2018;115:E2274–E2283. doi: 10.1073/pnas.1719588115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137.Graham SA. Fossil records in the Lythraceae. Botanical Rev. 2013;79:48–145. doi: 10.1007/s12229-012-9116-1. [DOI] [Google Scholar]
- 138.Coiro M, Doyle JA, Hilton J. How deep is the conflict between molecular and fossil evidence on the age of angiosperms? N. Phytologist. 2019;223:83–99. doi: 10.1111/nph.15708. [DOI] [PubMed] [Google Scholar]
- 139.Zhang L, et al. The water lily genome and the early evolution of flowering plants. Nature. 2020;577:79–84. doi: 10.1038/s41586-019-1852-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Yu G, Smith DK, Zhu H, Guan Y, Lam TT. GGTREE: an package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 2017;8:28–36. doi: 10.1111/2041-210X.12628. [DOI] [Google Scholar]
- 141.Wang Y, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49. doi: 10.1093/nar/gkr1293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Krzywinski M, et al. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Wang D, Zhang Y, Zhang Z, Zhu J, Yu J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteom. Bioinform. 2010;8:77–80. doi: 10.1016/S1672-0229(10)60008-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 144.Li Z, et al. Multiple large-scale gene and genome duplications during the evolution of hexapods. Proc. Natl Acad. Sci. 2018;115:4713–4718. doi: 10.1073/pnas.1710791115. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 145.Stolzer M, et al. Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees. Bioinformatics. 2012;28:i409–i415. doi: 10.1093/bioinformatics/bts386. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 146.Rabier C-E, Ta T, Ané C. Detecting and locating whole genome duplications on a phylogeny: a probabilistic approach. Mol. Biol. Evol. 2014;31:750–762. doi: 10.1093/molbev/mst263. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 147.Mendes FK, Vanderpool D, Fulton B, Hahn MW. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics. 2021;36:5516–5518. doi: 10.1093/bioinformatics/btaa1022. [DOI] [PubMed] [Google Scholar]
- 148.Clark JW, Donoghue PCJ. Constraining the timing of whole genome duplication in plant evolutionary history. Proc. R. Soc. B: Biol. Sci. 2017;284:20170912. doi: 10.1098/rspb.2017.0912. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Glick L, Mayrose I. ChromEvol: assessing the pattern of chromosome number evolution and the inference of polyploidy along a phylogeny. Mol. Biol. Evol. 2014;31:1914–1922. doi: 10.1093/molbev/msu122. [DOI] [PubMed] [Google Scholar]
- 150.Graham SA, Oginuma K, Raven PH, Tobe H. Chromosome numbers in Sonneratia and Duabanga (Lythraceae s.l.) and their systematic significance. Taxon. 1993;42:35–41. doi: 10.2307/1223300. [DOI] [Google Scholar]
- 151.Rice A, et al. The Chromosome Counts Database (CCDB)—a community resource of plant chromosome numbers. N. Phytologist. 2015;206:19–26. doi: 10.1111/nph.13191. [DOI] [PubMed] [Google Scholar]
- 152.Sun P, et al. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol. Plant. 2022;15:1841–1851. doi: 10.1016/j.molp.2022.10.018. [DOI] [PubMed] [Google Scholar]
- 153.Salse J. Ancestors of modern plant crops. Curr. Opin. Plant Biol. 2016;30:134–142. doi: 10.1016/j.pbi.2016.02.005. [DOI] [PubMed] [Google Scholar]
- 154.Tang H, et al. Synteny and collinearity in plant genomes. Science. 2008;320:486–488. doi: 10.1126/science.1153917. [DOI] [PubMed] [Google Scholar]
- 155.Cox MP, Peterson DA, Biggs PJ. SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinform. 2010;11:485. doi: 10.1186/1471-2105-11-485. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 156.Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 2019;37:907–915. doi: 10.1038/s41587-019-0201-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 157.Anders S, Pyl PT, Huber W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166–169. doi: 10.1093/bioinformatics/btu638. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 158.Gu K, Ng HKT, Tang ML, Schucany WR. Testing the ratio of two Poisson rates. Biometrical J. 2008;50:283–298. doi: 10.1002/bimj.200710403. [DOI] [PubMed] [Google Scholar]
- 159.Roulin A, et al. The fate of duplicated genes in a polyploid plant genome. Plant J. 2013;73:143–153. doi: 10.1111/tpj.12026. [DOI] [PubMed] [Google Scholar]
- 160.Liu S, et al. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat. Commun. 2014;5:3930. doi: 10.1038/ncomms4930. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 161.Feng X, et al. Expansion and adaptive evolution of the WRKY transcription factor family in Avicennia mangrove trees. Mar. Life Sci. Technol. 2023;5:155–168. doi: 10.1007/s42995-023-00177-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 162.Shannon P, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–2504. doi: 10.1101/gr.1239303. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 163.Feng X, et al. Molecular adaptation to salinity fluctuation in tropical intertidal environments of a mangrove tree Sonneratia alba. BMC Plant Biol. 2020;20:178. doi: 10.1186/s12870-020-02395-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 164.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
The raw genomic Illumina reads, PacBio reads, Hi-C reads, and RNA-seq reads reported in this paper have been deposited in the Genome Sequence Archive (GSA, https://ngdc.cncb.ac.cn/gsa) at the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation, under accession number CRA004284 with BioProject ID PRJCA005319. The genome assembly sequences have been deposited in the Genome Warehouse (GWH, https://ngdc.cncb.ac.cn/gwh) at the National Genomics Data Center under accession numbers GWHBCIQ00000000 [https://ngdc.cncb.ac.cn/gwh/Assembly/20653/show] and GWHBCKL00000000 [https://ngdc.cncb.ac.cn/gwh/Assembly/20692/show], with BioProject ID PRJCA004930 and BioSample IDs SAMC353197 and SAMC353201. The genome assemblies and annotations are also available at Figshare: Sonneratia alba [10.6084/m9.figshare.25118819] and Lagerstroemia speciosa [10.6084/m9.figshare.25118831]. Source data are provided with this paper.