Abstract
Background
Bemisia afer is a globally distributed whitefly species and a significant agricultural pest, yet the genomic and functional roles of its obligate endosymbiont remain poorly understood. The primary endosymbiont of whiteflies belongs to the genus Candidatus Portiera. Portiera is essential for host survival, providing nutritional supplementation and facilitating ecological adaptation, but its evolutionary dynamics and host-specific adaptations in B. afer are largely unexplored. Comparative genomic studies of Portiera from other whitefly species have revealed distinct evolutionary patterns, yet no such data exist for B. afer, highlighting a critical knowledge gap.
Results
We present the first complete genome of Portiera BeAf, the obligate endosymbiont of B. afer. The genome exhibits classic signatures of reductive evolution, including extreme AT bias (25.3% GC content), high coding density (74.7%), and significant gene loss, particularly in the DNA replication and repair pathways and the lysine biosynthesis pathway. Average Nucleotide Identity values between Portiera BeAf and known symbionts fall below the 95% species threshold, supporting its designation as a novel species. Phylogenetic analyses place Portiera BeAf as sister to the clade of B. tabaci-associated symbionts, yet reveal unique structural rearrangements and lineage-specific gene losses. Notably, Portiera BeAf harbors specific hypothetical proteins, including a putative ABCD4-like transporter, suggesting potential adaptations in nutrient transport or stress response. Comparative genomic analyses further demonstrate weakened codon usage bias and accelerated substitution rates in Bemisia-associated Portiera, reflecting relaxed selection in their obligate symbiotic niche.
Conclusions
Our study provides foundational insights into the genomic architecture and evolutionary trajectory of Portiera in B. afer, revealing both conserved and divergent features compared to other whitefly symbionts. The loss of key metabolic and repair genes underscores the role of host compensation in maintaining symbiont functionality, while lineage-specific innovations may reflect adaptations to host ecological demands. These findings advance our understanding of Portiera's genomic diversity and highlight the complex interplay between reductive evolution and host-symbiont coadaptation in ancient symbiotic systems.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-025-12509-6.
Keywords: Bemisia afer, Portiera, Genome instability, ABC transporter, Lysine biosynthesis
Background
Bemisia afer (Priesner & Hosny) (Hemiptera: Aleyrodidae) is a genetically diverse whitefly species with a broad geographical distribution, spanning sub-Saharan Africa, Asia, Australia, Europe, and South America [1–6]. It is a significant agricultural pest, not only because of direct damage from phloem feeding but also because of its role as a vector of plant viruses. Notably, it efficiently transmits Sweet potato chlorotic stunt virus, causing severe yield losses in sweet potato (Ipomoea batatas) [7]. Its polyphagous nature, feeding on over 20 plant families, and its sympatric distribution with other whitefly vectors further heighten concerns about its potential to spread viral pathogens in crops such as cassava (Manihot esculenta) [8–10]. These traits highlight the need to understand the biological factors driving its ecological success and epidemiological impact. Among these factors, microbial endosymbionts are hypothesized to influence their hosts’ adaptation [11], nutritional ecology [12], and virus transmission efficiency [13], though their genomic and functional roles in this species remain unexplored.
Obligate endosymbionts, such as the bacterial genus Candidatus Portiera in whiteflies, are essential for the survival and ecological adaptation of their hosts through multifaceted biological functions. Primarily, Portiera compensates for the nutrient-deficient phloem diet of the whitefly B. tabaci by synthesizing essential amino acids (EAAs) and vitamins, such as pantothenate, which are critical for host growth, reproduction, and stress resistance [14, 15]. Despite its highly reduced genome (≤ 351 kb), Portiera retains key metabolic pathways, though it relies on host-derived genes (e.g., GOT2 for phenylalanine synthesis and panBC for pantothenate production) or secondary symbiont-derived genes (e.g., dapB, dapF, and lysA for lysine synthesis) to complete these processes, exemplifying a tightly coevolved metabolic interdependence [16, 17]. Additionally, Portiera contributes to the host’s cuticle formation by supplying phenylalanine-derived dopamine, enhancing resistance to entomopathogenic fungi and environmental stressors [18]. Beyond nutrition, Portiera modulates host physiology, including vitellogenin-mediated oogenesis and symbiont transmission, ensuring vertical persistence [19, 20]. Collectively, Portiera exemplifies a keystone symbiont whose diverse functions have been extensively characterized in B. tabaci. However, research on its biological roles in B. afer remains strikingly limited, leaving the evolutionary conservation and host-specific adaptations of this essential symbiont across whitefly species poorly understood.
To date, the complete genomes of nine Portiera lineages have been reported, obtained from metagenomes of five whitefly species, Aleurodicus dispersus, Aleurodicus floccissimus, Aleyrodes shizuokensis, Pealius mori, and Trialeurodes vaporariorum, as well as four cryptic species of B. tabaci, namely MEAM1 (formerly B biotype), MED (formerly Q biotype), Asia II 3 (formerly ZHJ1 biotype), and China 1 (formerly ZHJ3 biotype) [21–27]. Ca. Portiera is proposed as the genus for whitefly obligate endosymbionts and is thought to contain a single species, Ca. Portiera aleyrodidarum, with the endosymbiont of B. tabaci (NCBI accession number AY268082) as its type species [28]. Recently, comparative genomics revealed three distinct genetic groups of Portiera: one from the Aleurodicinae genus Aleurodicus, another from the Aleyrodinae genus Bemisia, and a third from the other Aleyrodinae genera Aleyrodes, Pealius, and Trialeurodes. The Average Nucleotide Identity (ANI) values among these groups fall below the 95% threshold, implying that they may represent distinct species [24].
B. tabaci-associated Portiera exhibit accelerated genomic instability, characterized by extensive rearrangements, repetitive intergenic regions, and loss of DNA repair genes (e.g., dnaQ), contrasting with the structural stability observed in T. vaporariorum-associated bacteria [23, 26]. The genomic architecture of B. afer’s Portiera (BeAf) likely mirrors the instability of its Aleurolobini relatives, driven by ancestral dnaQ loss [25]. However, empirical validation is critical to confirm the degree of genomic instability and understand its impact on this poorly studied symbiotic relationship.
Here, we present the first complete genome of Portiera BeAf, revealing novel gene arrangements. To investigate its genomic features and evolutionary significance, we addressed the following key research questions: (i) What are the structural and functional characteristics of the Portiera BeAf genome, and how do they compare with other Portiera lineages? (ii) How do the identified genomic features, such as gene gains, losses, and rearrangements, influence the symbiotic functions of Portiera BeAf? and (iii) What are the implications of these findings for understanding the co-evolutionary relationships between Portiera and its host? By addressing these questions, we aimed to elucidate the genomic diversity and evolutionary trajectory of Portiera BeAf, providing new insights into bacterial endosymbiont adaptation and host-symbiont interactions.
Methods
Genome sequencing
Genomes of intracellular symbionts can be assembled from the raw reads generated when sequencing the host genome. In our previous study, we sequenced the genomic DNA of B. afer [29]. Briefly, B. afer specimens were collected from Linyi, Shandong Province, China, on its host plant, Abutilon Miller. Genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, San Diego, California, USA) following the manufacturer’s instructions. A total of 20 μL of DNA was used to construct a paired-end library with 150 bp read lengths, which was then sequenced on a HiSeq 2500 system (Illumina, San Diego, California, USA). In this study, we used the publicly available raw reads from that host genome sequencing run (NCBI accession number SRR25460813) to assemble the genome of Portiera BeAf.
Genome assembly and annotation
All bioinformatic programs and software were used with default parameters unless otherwise stated. Detailed parameters are provided in Supplementary Table S1. The raw sequencing data were filtered using fastp (v0.20.0) [30] to remove adapter contamination, filter low-quality reads, and correct wrongly represented bases. To eliminate the impact of host DNA on the assembly results, we extracted the potential reads of Portiera based on the reference genomes of nine Portiera lineages (Table 1). Briefly, we used Bowtie 2 (v2.5.4) [31] to build the index and mapped the clean reads to the reference genomes. SAMtools (v1.22) [32] and BedTools (v2.31.0) [33] were used to convert the resultant SAM format files to BED format, followed by extracting read names from the BED file. Then seqtk (v1.5) [34] was used to retrieve sequences from the clean reads based on these names. As a result, 125,199,775 bp of forward reads and 125,162,012 bp of reverse reads were retrieved for Portiera genome assembly. Unicycler (v0.4.8) [35] was used for de novo genome assembly, which yielded 26 contigs. To complete the circular genome, we used every contig as a seed to extend the sequence using NOVOPlasty (v4.3.3) [36]. Among them, one contig (Supplementary Data S1) was successfully extended to acquire a circular genome. To evaluate the sequencing quality, BWA (v0.7.17) [37] and SAMtools were used to calculate the sequencing depth per site. The completeness of the genome assembly for all Portiera lineages, including BeAf, was evaluated using BUSCO (v5.5.0) [38] against the bacteria_odb10 dataset in ‘genome’ mode. To confirm that the assembled sequence belongs to the Portiera genome, we evaluated its similarity with nine other known genomes using CompàreGenome [39]. In brief, the homologous genes were identified through BLAST+ (v2.14.0) [40], and the alignments were scored using the Reference Similarity Score (RSS).
The Pearson correlation coefficients were calculated based on the pairwise alignment scores. To annotate the genome, the circular sequence was submitted to Bakta (v1.11.3) [41], a rapid and standardized annotation tool of bacterial genomes. Bakta integrates multiple tools for annotating bacterial genomes: Pyrodigal (v3.6.3) [42] predicts coding sequences (CDSs), tRNAscan-SE (v2.0.0) [43] predicts tRNAs, ARAGORN (v1.2.41) [44] predicts tmRNAs, and Infernal (v1.1.5) [45] predicts rRNAs and ncRNAs. Then the dnaK gene was designated as the origin of the genome. CGView (v2.0.3) [46] was used to draw the circular map of the genome, as well as presenting the GC content, GC skew, and sequencing depth per site. The genome was finally submitted to the National Genomics Data Center, China National Center for Bioinformation (CNCB-NGDC) [47]. To assess the species designation of Portiera BeAf, pairwise ANI values for all pairs of lineages were calculated using fastANI (v1.31) [48] with a fragment length of 1,000 bp. ANI values below the 95% species delineation threshold indicate separate species [49].
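As a rough illustration of how the 95% cutoff can be applied to pairwise ANI output, the following minimal sketch parses tab-separated rows in the layout that fastANI writes (query, reference, ANI, mapped fragments, total fragments) and lists the pairs that fall below the threshold. The file names and ANI values in the example are hypothetical placeholders, not results from this study, and the helper names are our own.

```python
SPECIES_ANI_THRESHOLD = 95.0  # percent; the species cutoff cited in the text


def parse_ani_rows(lines):
    """Return {(query, reference): ANI} from tab-separated fastANI-style rows."""
    ani = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        ani[(fields[0], fields[1])] = float(fields[2])
    return ani


def below_species_threshold(ani, threshold=SPECIES_ANI_THRESHOLD):
    """Return the sorted genome pairs whose ANI is below the threshold."""
    return sorted(pair for pair, value in ani.items() if value < threshold)


# Hypothetical rows for illustration only (not values from this study).
example_rows = [
    "lineageA.fna\tlineageB.fna\t99.20\t110\t115",
    "lineageA.fna\tlineageC.fna\t88.50\t95\t115",
]
pairs = below_species_threshold(parse_ani_rows(example_rows))
print(pairs)
```

Pairs returned by `below_species_threshold` are candidates for separate species under the stated criterion; pairs at or above the cutoff are not flagged.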
Table 1.
Complete Candidatus Portiera genomes used in this study
| Lineage | Host species | Database | Accession number | Reference |
|---|---|---|---|---|
| AdSh | Aleyrodes shizuokensis | NGDC^a | GWHBOVO00000000 | [24] |
| AlDi | Aleurodicus dispersus | NCBI^b | LN649255 | [23] |
| AlFl | Aleurodicus floccissimus | NCBI | LN734649 | [23] |
| BeAf | Bemisia afer | NGDC | GWHGGFS01000000 | This study |
| BTB | Bemisia tabaci MEAM1 | NCBI | CP003708 | [22] |
| BTQ | Bemisia tabaci MED | NCBI | CP003835 | [21] |
| BTZ1 | Bemisia tabaci Asia II 3 | NCBI | CP016327 | [27] |
| BTZ3 | Bemisia tabaci China 1 | NCBI | CP016343 | [27] |
| PeMo | Pealius mori | NCBI | LR744089 | [25] |
| TrVa | Trialeurodes vaporariorum | NCBI | CP004358 | [26] |
^a The National Genomics Data Center, China National Center for Bioinformation, Chinese Academy of Sciences [47]
^b The National Center for Biotechnology Information
Genome synteny analysis
Genome synteny between Portiera BeAf and other lineages was analyzed using the Python-based JCVI library (v1.5.7) [50]. Pairwise inter-genomic comparisons were conducted between Portiera BeAf and each of the other lineages. The CDS sequences were pairwise aligned, and the synteny between the paired genomes was determined based on the alignment results and the genomic positions of the CDSs. The genomes of the nine known Portiera lineages exhibit two distinct gene arrangements: one found in the lineages of the B. tabaci cryptic species complex (BTB, BTQ, BTZ1, and BTZ3), and the other present in the remaining lineages (AlDi, AlFl, AdSh, PeMo, and TrVa) [24]. The gene arrangement of Portiera BeAf differed from both. To study genome synteny from a broader perspective, we defined blocks of five consecutive genes and conducted a comparative synteny analysis between the BeAf lineage and two representative lineages, BTB and AlDi.
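The block-based comparison described above can be sketched as follows. This is a simplified illustration and not the JCVI implementation: it assumes each genome is given as an ordered list of orthologous gene identifiers, treats the chromosome as linear (wrap-around at the origin is ignored), and uses hypothetical gene names.

```python
def make_blocks(gene_order, size=5):
    """Split an ordered gene list into consecutive blocks of `size` genes."""
    return [gene_order[i:i + size] for i in range(0, len(gene_order), size)]


def block_orientation(block, other_order):
    """Classify how a block of genes is arranged in another genome."""
    position = {gene: i for i, gene in enumerate(other_order)}
    indices = [position[g] for g in block if g in position]
    if len(indices) < 2:
        return "unplaced"  # too few shared genes to judge the arrangement
    steps = [b - a for a, b in zip(indices, indices[1:])]
    if all(s == 1 for s in steps):
        return "collinear"
    if all(s == -1 for s in steps):
        return "inverted"
    return "rearranged"


# Hypothetical gene orders: the second half is inverted in the other genome.
reference = [f"g{i}" for i in range(1, 11)]
other = reference[:5] + reference[5:][::-1]
calls = [block_orientation(b, other) for b in make_blocks(reference)]
print(calls)
```

Working with fixed-size blocks rather than single genes trades resolution for readability: small local differences are absorbed into a "rearranged" call, while large inversions remain visible as runs of "inverted" blocks.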
Identification of orthologous genes
The amino acid sequences of all protein-coding genes (PCGs) were extracted from the 10 Portiera genomes, and orthologous genes were identified using OrthoFinder (v2.5.5) [51]. The orthologous gene overlaps across the 10 Portiera lineages were analyzed and visualized as an UpSet plot generated with the UpSetR package [52] in R, with a supplementary floral diagram produced using the plotrix package [53].
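The overlap analysis amounts to counting how many orthogroups share each exact presence/absence pattern across lineages, which is the quantity an UpSet plot displays. The sketch below shows that counting step in isolation; the orthogroup identifiers and memberships are hypothetical, and the input is assumed to be a mapping from orthogroup to the set of lineages containing it.

```python
from collections import Counter


def membership_patterns(orthogroups):
    """Count orthogroups by the exact set of lineages that contain them."""
    return Counter(frozenset(members) for members in orthogroups.values())


def core_orthogroups(orthogroups, lineages):
    """Return orthogroups present in every lineage."""
    everyone = set(lineages)
    return sorted(og for og, members in orthogroups.items()
                  if set(members) == everyone)


# Hypothetical toy input, not the orthogroups from this study.
lineages = ["BeAf", "BTB", "AlDi"]
toy = {
    "OG1": {"BeAf", "BTB", "AlDi"},
    "OG2": {"BeAf", "BTB"},
    "OG3": {"BeAf"},
    "OG4": {"BeAf", "BTB", "AlDi"},
}
patterns = membership_patterns(toy)
core = core_orthogroups(toy, lineages)
print(core, patterns[frozenset({"BeAf"})])
```

Each key of `patterns` corresponds to one bar of an UpSet plot, and the count for the full lineage set gives the size of the conserved core.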
Phylogenetic tree construction
To further construct the phylogenetic tree of Portiera, we introduced an outgroup, Halomonas elongata (GenBank accession number NC_014532), and re-identified their single-copy orthologous genes using OrthoFinder. The nucleotide sequences of the single-copy orthologous genes from Portiera and the outgroup were used to construct the rooted phylogenetic tree. Each orthologous gene was aligned using MAFFT (v7.471) [54], and then trimmed with trimAl (v1.5.0) [55]. The trimmed gene sequences were concatenated using R package phylotools (v0.2.2) [56]. The partitioned alignment files were submitted to IQ-TREE (v2.0.6) [57] for maximum likelihood (ML) phylogenetic tree construction, with bootstrap support values calculated from 1,000 replicates. The optimal substitution model was selected under Bayesian Information Criterion (BIC). The tree was visualized using FigTree (v1.4.4) [58].
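The concatenation step that precedes a partitioned ML analysis can be sketched as below. This is an illustrative stand-in for the phylotools workflow, not its actual code: it assumes each gene alignment is a mapping from taxon to an already aligned and trimmed sequence, pads any missing taxon with gaps, and records 1-based partition boundaries of the kind a partition file would list. Taxon and gene names are hypothetical.

```python
def concatenate_alignments(gene_alignments, taxa):
    """Join per-gene alignments into a supermatrix and record partitions.

    gene_alignments: list of (gene_name, {taxon: aligned_sequence}).
    Returns ({taxon: concatenated_sequence}, [(gene, start, end), ...])
    with 1-based inclusive coordinates.
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not of equal aligned length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this gene is filled with gap characters.
            pieces[taxon].append(alignment.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Hypothetical two-gene, two-taxon example.
genes = [
    ("geneA", {"T1": "ATG", "T2": "ATA"}),
    ("geneB", {"T1": "CC--", "T2": "CCGG"}),
]
matrix, partitions = concatenate_alignments(genes, ["T1", "T2"])
print(matrix, partitions)
```

Keeping the partition boundaries alongside the supermatrix is what allows a separate substitution model to be selected for each gene during tree inference.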
The host phylogeny was reconstructed to assess its congruence with the Portiera phylogeny. With the exception of the mitochondrial genomes for B. tabaci Asia II 3 and Aleurodicus floccissimus, which were unavailable, data for eight host species and the outgroup (Diaphorina citri) were obtained from NCBI. The phylogenetic tree was inferred using the 13 protein-coding genes from the mitochondrial genomes, employing the same methodology as described above.
Codon usage bias of orthologous genes
Selection pressure and codon usage bias of Portiera orthologous genes were evaluated using the nonsynonymous substitutions per nonsynonymous site (Ka), the synonymous substitutions per synonymous site (Ks), the effective number of codons (ENC), and the Codon Adaptation Index (CAI). Ka quantifies nucleotide substitutions in protein-coding sequences that result in amino acid changes, whereas Ks captures silent substitutions that do not alter the encoded amino acid. A Ka/Ks ratio > 1 implies positive selection, in which amino acid changes confer adaptive advantages, whereas a ratio < 1 implies purifying selection. The Ka and Ks values of orthologous genes were computed for all lineages relative to the reference AlDi lineage using the KaKs_Calculator 2.0 toolkit [59]. The relationship between Ka and Ks values was visualized using scatter plots generated with the ggplot2 package [60] in R. The ENC estimates the number of equally used codons that would generate the observed codon bias, ranging from 20 (extreme bias, only one codon per amino acid) to 61 (no bias, all codons used equally). An ENC value < 35 implies strong codon bias, indicating preferential use of specific codons [61]. The ENC values of all orthologous genes in each lineage were calculated using CodonW (v1.4.4) [62]. A density plot was generated with the ggpointdensity [63] and ggplot2 packages in R to analyze the distribution patterns of the ENC values. The CAI reflects the similarity between a gene's codon usage and that of a reference set of highly expressed genes. It ranges from 0 (no similarity) to 1 (perfect match), with higher values indicating stronger bias toward optimal codons [64]. The reference set of highly expressed genes was acquired from the Codon Usage Database [65] with the set ID “Candidatus Portiera aleyrodidarum [gbbct]: 31”. The CAI values of all orthologous genes in each lineage were calculated using CAIcal_ECAI (v1.4) [66]. A density plot was generated to analyze the distribution patterns of the CAI values.
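To make the CAI definition concrete, the sketch below computes it as the geometric mean of per-codon relative adaptiveness values, where each codon's weight is its reference count divided by the count of the most frequent synonymous codon. It is a simplified illustration rather than the CAIcal implementation: Met, Trp, and stop codons are excluded, codons absent from the reference are skipped (an assumption made here for brevity), and the reference counts are hypothetical.

```python
import math
from collections import defaultdict

# Standard genetic code in TCAG order (amino acid assignments shared by
# the bacterial translation table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                AMINO_ACIDS))


def relative_adaptiveness(ref_counts):
    """Weight each codon by its count relative to the top synonymous codon."""
    by_amino_acid = defaultdict(list)
    for codon, n in ref_counts.items():
        by_amino_acid[CODE[codon]].append(n)
    return {codon: n / max(by_amino_acid[CODE[codon]])
            for codon, n in ref_counts.items() if n > 0}


def cai(cds, weights):
    """Geometric mean of codon weights over the informative codons of a CDS."""
    cds = cds.upper()
    logs = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if CODE.get(codon) in (None, "*", "M", "W"):
            continue  # uninformative or unrecognized codon
        if codon in weights:
            logs.append(math.log(weights[codon]))
    if not logs:
        raise ValueError("no informative codons found")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference counts for the two lysine codons.
weights = relative_adaptiveness({"AAA": 90, "AAG": 10})
print(cai("AAAAAA", weights), cai("AAAAAG", weights))
```

A gene built only from the preferred codon scores 1, and each use of a rarer synonymous codon pulls the geometric mean down, which is why lower mean CAI values indicate weaker bias toward optimal codons.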
Functional annotation of genes absent from Portiera of Bemisia
Ortholog analysis revealed the absence of 17 conserved genes across all Portiera lineages from B. afer and B. tabaci cryptic species. To characterize the potential functional implications of these gene losses, we conducted the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis and the Clusters of Orthologous Groups (COG) functional categorization. The KEGG identifiers of all genes were mapped to KEGG Orthology (KO) numbers and subsequently analyzed using KEGG pathway mapping tool [67] to determine their functional pathway associations. R packages ggsci [68], ggpubr [69], scales [70], and ggplot2 were used to visualize the category of KEGG pathways. The COG identifiers were mapped to COG functional categories [71]. The number of genes in each category was counted to evaluate the functional enrichment of these genes.
Functional annotation of genes unique to Portiera of Bemisia
The Portiera symbionts in Bemisia possess unique orthologous genes, all of which encode hypothetical proteins. The amino acid sequences of these hypothetical proteins were submitted to InterPro [72, 73] to classify them into families and predict domains and important sites. The protein signatures were visualized using the R package gggenes [74]. Furthermore, we used SWISS-MODEL [75, 76] to predict the protein structure of a putative ATP-binding cassette (ABC) transporter gene uniquely encoded in Portiera BeAf. The predicted model was visualized using Chimera (v1.19) [77]. To further investigate the function of this putative ABC transporter gene, its subcellular localization was predicted using DeepLoc (v2.0) [78].
Results
Genomic features of Portiera BeAf
The genome of Portiera BeAf comprises a single circular chromosome of 356,426 bp, with no plasmids detected. The genome exhibits an average GC content of 25.3%, implying an extreme AT-bias in the genome. It exhibits a positive GC skew of 0.0709. Compared to the nine registered Portiera complete genomes, its genome size is more similar to BTB, BTQ, BTZ1, and BTZ3 lineages (349.1 kb–350.1 kb), while being notably larger than AdSh, AlDi, AlFl, PeMo, and TrVa lineages (271.2 kb–283.6 kb). The genome contains 273 CDSs, three rRNA genes, 35 tRNA genes, and a single tmRNA gene, with a coding density of 74.7% (Fig. 1A). The coding density is slightly higher than that of BTB, BTQ, BTZ1, and BTZ3 lineages (67.6%–67.9%), but notably lower than that of AdSh, AlDi, AlFl, PeMo, and TrVa lineages (91.7%–94.9%). The BUSCO assessment performed across all Portiera lineages revealed universal incompleteness (Supplementary Fig. S1). Specifically, the BeAf lineage contained 48.4% of complete BUSCOs, which is comparable to the other nine lineages (ranging from 46.0% to 53.2%). Within the complete category, Portiera BeAf showed a duplicated proportion of 3.2%, which was not observed in other lineages. The fragmented proportion observed in Portiera BeAf (8.9%) fell within the range observed in all analyzed lineages (6.5%–12.9%). The Pearson correlation coefficients were derived from the RSS of orthologous genes across 10 Portiera lineages. The BeAf lineage exhibited lower correlation values with other lineages, suggesting divergent evolutionary patterns (Fig. 1B).
Fig. 1.
Genome assembly of Portiera BeAf. A The circular map of the Portiera BeAf genome. From outer to inner: genes (including CDS, tRNA, rRNA, and tmRNA) on the forward and reverse strands, GC content, GC skew, and sequencing depth per site. The average sequencing depth is 1,514 ×. B The Pearson correlation coefficient heatmap of the Reference Similarity Score (RSS) values for genomic orthologous genes. C The diagram of genome synteny between BeAf and two representative lineages, AlDi and BTB. BeAf, AlDi, and BTB represent the Portiera lineages from Bemisia afer, Aleurodicus dispersus, and B. tabaci MEAM1, respectively
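The summary statistics reported for the genome (GC content, GC skew, and coding density) follow from simple definitions, sketched below. GC skew is taken as (G − C)/(G + C). Coding density is computed here as the percentage of the genome covered by CDS intervals with overlaps merged; whether overlaps were merged in the original analysis is not stated, so that choice is an assumption, as are the toy sequence and coordinates.

```python
def gc_content(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq):
    """(G - C) / (G + C); assumes the sequence contains at least one G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c)


def coding_density(cds_intervals, genome_length):
    """Percent of the genome covered by CDSs (1-based inclusive intervals).

    Overlapping intervals are merged so shared bases are counted once.
    Genes spanning the origin of a circular genome are not handled.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(cds_intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / genome_length


# Hypothetical toy inputs for illustration.
toy_seq = "GGGCATAT"
print(gc_content(toy_seq), gc_skew(toy_seq))
print(coding_density([(1, 10), (5, 20), (31, 40)], 100))
```

A positive skew, as reported for Portiera BeAf, means that G outnumbers C on the sequenced strand.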
Our previous synteny analysis identified two distinct gene arrangement patterns among Portiera lineages: one in the B. tabaci-associated lineages (BTB, BTQ, BTZ1, and BTZ3) and the other in the lineages from other whitefly species (AdSh, AlDi, AlFl, PeMo, and TrVa) [24]. Notably, the new BeAf lineage showed a unique genomic organization (Supplementary Fig. S2). Although the genome organization of BeAf was closer to that of the B. tabaci-associated Portiera, it was characterized by large-scale genomic inversions, suggesting lineage-specific structural rearrangements (Fig. 1C).
The pairwise ANI values between the BeAf lineage and all other lineages were below the species boundary threshold of 95%, implying that BeAf represents a novel species (Fig. 2). In contrast, ANI values exceeded 99% within the BTB, BTQ, BTZ1, and BTZ3 group, and were greater than 94% within the AdSh, PeMo, and TrVa group. These genomic distinctions supported the separation of BeAf from the other lineages.
Fig. 2.
Average nucleotide identity matrix among different Portiera lineages. Numerical values are shown in the lower left, with corresponding color-coded circles in the upper right
Phylogeny of Portiera
To clarify the phylogenetic position of BeAf, we further identified orthologous genes among these 10 Portiera lineages, and constructed a ML phylogenetic tree based on the single-copy orthologous genes. A total of 2,629 out of 2,665 genes from the 10 lineages were assigned into 282 orthogroups. The high classification ratio (98.7%) suggested a high degree of evolutionary conservation among these genes. Among these orthogroups, 218 (77.3%) were conserved across all lineages. Notably, 17 orthogroups (6.0%) were exclusively present in AdSh, PeMo, TrVa, AlDi, and AlFl, while six orthogroups (2.1%) were unique to BTB, BTQ, BTZ1, and BTZ3. An additional six orthogroups (2.1%) were unique to AlDi and AlFl. Portiera BeAf possessed one unique orthogroup, while two additional orthogroups were exclusively shared among BeAf, BTB, BTQ, BTZ1, and BTZ3 lineages (Fig. 3A). Among the 10 Portiera lineages, we identified 190 single-copy orthologs. However, inclusion of an outgroup reduced this number to 139. We subsequently used these 139 conserved single-copy orthologs to construct a rooted ML phylogenetic tree for determining the phylogenetic position of BeAf. All Portiera branches exhibited substantially shorter branch lengths compared to the outgroup, suggesting slower evolutionary rates within this clade. The Portiera phylogeny showed near congruence with the phylogeny of their whitefly hosts, suggesting co-diversification (Supplementary Fig. S3). The topology demonstrated that the BeAf lineage formed a distinct monophyletic lineage, which was sister to a well-supported clade comprising BTB, BTQ, BTZ1, and BTZ3 lineages (Fig. 3B). This phylogenetic separation of BeAf further supported its status as a distinct species within Portiera.
Fig. 3.
Phylogeny of Portiera lineages. A The UpSet plot and the floral diagram of the orthogroups among the 10 Portiera lineages. BTB, BTQ, BTZ1, BTZ3, BeAf, AdSh, PeMo, TrVa, AlDi, and AlFl represent the Portiera lineages from B. tabaci MEAM1, B. tabaci MED, B. tabaci Asia II 3, B. tabaci China 1, B. afer, Aleyrodes shizuokensis, Pealius mori, Trialeurodes vaporariorum, Aleurodicus dispersus, and Aleurodicus floccissimus, respectively. B The maximum likelihood (ML) phylogenetic tree inferred from Portiera single-copy orthologs. The tree was constructed with 1,000 ultrafast bootstrap replicates, and bootstrap values are shown at the nodes. Scale bar represents 0.1 amino acid substitutions per site. The tree was rooted using Halomonas elongata (HaEl), a closely related species
Codon usage bias of Portiera
Based on the phylogenetic tree of Portiera, the most ancestral AlDi lineage was selected as the reference genome to calculate Ka and Ks values for other lineages. The AlFl lineage exhibited the lowest Ka and Ks values, while the AdSh, PeMo, and TrVa lineages showed elevated and comparable Ka values, as well as Ks values. The BTB, BTQ, BTZ1, BTZ3, and BeAf lineages displayed the highest Ka and Ks values, which were also closely clustered. These findings are consistent with the phylogenetic analysis, demonstrating that more distantly related taxa exhibit higher rates of both nonsynonymous and synonymous substitutions. Among all 2,272 genes analyzed, only 19 (0.84%) exhibited Ka > Ks, with the maximum Ka/Ks ratio being 1.22, suggesting an overall absence of positive selection and a predominance of purifying selection across the genomes. The distribution of Ka and Ks values for the BeAf lineage in the scatter plot closely matched those of BTB, BTQ, BTZ1, and BTZ3, indicating parallel evolutionary patterns (Fig. 4A). We further compared codon usage bias among lineages using the ENC and CAI. The BTB (ENC = 35.84 ± 0.23), BTQ (35.90 ± 0.22), BTZ1 (35.90 ± 0.22), BTZ3 (35.76 ± 0.23), and BeAf (35.18 ± 0.21) lineages exhibited mean ENC values > 35, indicating relatively weak codon usage bias and lower gene expression levels. In contrast, the AdSh (31.04 ± 0.18), PeMo (31.81 ± 0.19), TrVa (32.33 ± 0.18), AlDi (32.85 ± 0.22), and AlFl (32.37 ± 0.20) lineages showed mean ENC values < 35, suggesting stronger codon usage bias and higher gene expression levels (Fig. 4B). Consistent with these findings, the mean CAI values of BTB (CAI = 0.76 ± 0.00), BTQ (0.76 ± 0.00), BTZ1 (0.76 ± 0.00), BTZ3 (0.76 ± 0.00), and BeAf (0.77 ± 0.00) were substantially lower than those of AdSh (0.86 ± 0.00), PeMo (0.85 ± 0.00), TrVa (0.84 ± 0.00), AlDi (0.83 ± 0.00), and AlFl (0.84 ± 0.00), further supporting reduced gene expression in the former group (Fig. 4C).
Fig. 4.
Codon usage bias of Portiera orthologous genes. A The scatter plot of nonsynonymous substitution rate (Ka) versus synonymous substitution rate (Ks) of Portiera orthologous genes with reference to those of Portiera AlDi. The dashed line with a slope of 1 represents equal values of Ka and Ks. Points above this line (Ka > Ks) suggest positive selection acting on these genes. Points below this line (Ka < Ks) suggest purifying selection. B The density plot of the effective number of codons (ENC) values for orthologous genes. C The density plot of the Codon Adaptation Index (CAI) values for orthologous genes
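The reading of the Ka versus Ks scatter plot can be expressed as a simple per-gene rule, sketched below under the interpretation given in the caption (points above the diagonal have Ka > Ks, points below have Ka < Ks). The function names and the example (Ka, Ks) pairs are hypothetical; the final line only re-derives the percentage from the counts reported in the text (19 of 2,272 genes).

```python
def classify_ka_ks(ka, ks):
    """Place one gene relative to the Ka = Ks diagonal."""
    if ks == 0:
        return "undefined"  # the ratio cannot be computed without synonymous change
    ratio = ka / ks
    if ratio > 1:
        return "Ka > Ks"
    if ratio < 1:
        return "Ka < Ks"
    return "Ka = Ks"


def percent_above_diagonal(pairs):
    """Count and percentage of genes with Ka greater than Ks."""
    above = sum(1 for ka, ks in pairs if ka > ks)
    return above, 100.0 * above / len(pairs)


# Hypothetical (Ka, Ks) pairs for illustration only.
example_pairs = [(0.10, 0.80), (0.30, 0.25), (0.05, 0.60), (0.20, 0.90)]
labels = [classify_ka_ks(ka, ks) for ka, ks in example_pairs]
print(labels, percent_above_diagonal(example_pairs))

# Re-deriving the reported proportion from the counts given in the text.
print(round(100.0 * 19 / 2272, 2))
```

A ratio only slightly above 1 is weak evidence on its own, so genes near the diagonal are better treated as candidates for further testing than as confirmed cases of positive selection.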
Functional analysis of genes absent from Portiera of Bemisia whitefly
Seventeen genes were identified as orthologs in the AlDi, AlFl, AdSh, PeMo, and TrVa lineages but were absent from the BTB, BTQ, BTZ1, BTZ3, and BeAf lineages from Bemisia whiteflies (Table 2). KEGG pathway analysis indicated that these genes were primarily associated with the replication and repair pathways and the amino acid metabolism pathways (Fig. 5A & Supplementary Table S2). COG functional classification also showed overrepresentation in the [L] replication, recombination and repair and [E] amino acid transport and metabolism categories (Supplementary Fig. S4). Among these, eight genes (including dnaQ) were functionally annotated to the replication and repair pathways, consistent with findings by Santos-Garcia et al. [25], suggesting a potential link to Portiera genome instability in Bemisia species (Supplementary Fig. S5). Intriguingly, while Zhu et al. [27] and Zhu et al. [79] reported the loss of lysine biosynthesis genes (dapB, lysA, and dapF) in the Portiera genome of B. tabaci, our study further revealed that lysA and dapF were absent exclusively from Bemisia-associated Portiera. The dapB gene in the B. tabaci-associated Portiera lineages has been truncated into a pseudogene by a premature stop codon, whereas it remains intact in the BeAf lineage and in the other lineages, which also possess the dapF and lysA genes (Fig. 5B–C; Supplementary Fig. S6).
Table 2.
List of genes absent from Portiera of Bemisia whitefly
| Gene | KEGG | COG | Description |
|---|---|---|---|
| dapF | K01778 | COG0253 | Diaminopimelate epimerase |
| deaD | K05592 | COG0513 | ATP-dependent RNA helicase |
| dnaN | K02338 | COG0592 | DNA polymerase III subunit beta |
| dnaQ | K02342 | COG0847 | DNA polymerase III subunit epsilon |
| dnaX | K02343 | COG2812 | DNA polymerase III subunit tau |
| era | K03595 | COG1159 | GTPase |
| frr | K02838 | COG0233 | Ribosome-recycling factor |
| holA | K02340 | COG1466 | DNA polymerase III subunit delta |
| holB | K02341 | COG0470 | DNA polymerase III subunit delta' |
| htrA | K04771 | COG0265 | Putative serine protease |
| ksgA | K02528 | COG0030 | Ribosomal RNA small subunit methyltransferase A |
| lspA | K03101 | COG0597 | Lipoprotein signal peptidase |
| lysA | K01586 | COG0019 | Diaminopimelate decarboxylase |
| mutL | K03572 | COG0323 | DNA mismatch repair protein |
| ruvC | K01159 | COG0817 | Crossover junction endodeoxyribonuclease |
| ssb | K03111 | COG0629 | Single-stranded DNA-binding protein |
| trpS | K01867 | COG0180 | Tryptophan–tRNA ligase |
Fig. 5.
Functional analysis of genes absent from Portiera of Bemisia. A KEGG pathway classification of orthologous genes that are present in the Portiera AlDi, AlFl, AdSh, PeMo, and TrVa genomes but absent from the Portiera BeAf, BTB, BTQ, BTZ1, and BTZ3 genomes. B Genes involved in the biosynthesis pathway of lysine. A complete gene is shown in blue, and a pseudogene is shown in gray. Blank indicates absence of the gene from the genome. C Protein sequence alignment of dapB genes from Portiera lineages. The partial alignment shown here reveals the presence of premature stop codons in the sequences of B. tabaci-associated Portiera lineages. The complete alignment is provided in Supplementary Fig. S6
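A premature stop codon of the kind described for dapB can be screened for by translating the coding sequence in frame and looking for a stop before the final codon. The sketch below illustrates this check with the standard genetic code; it is not the procedure used in the study, and the two test sequences are short hypothetical examples rather than dapB sequences.

```python
# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                AMINO_ACIDS))


def translate(cds):
    """Translate a CDS in frame; unrecognized codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def first_internal_stop(cds):
    """Return the 1-based codon position of the first premature stop, or None."""
    protein = translate(cds)
    body = protein[:-1] if protein.endswith("*") else protein
    index = body.find("*")
    return None if index == -1 else index + 1


# Hypothetical sequences: the first is intact, the second has an early stop.
intact = "ATGAAAGGGTAA"
truncated = "ATGAAATAAGGGTAA"
print(first_internal_stop(intact), first_internal_stop(truncated))
```

A reported position marks where translation would terminate early, which is the basis for annotating the downstream remainder of the gene as part of a pseudogene.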
Signatures of genes unique to Portiera of Bemisia
All genes uniquely identified in Bemisia-associated Portiera through ortholog analysis were annotated as hypothetical protein genes. Ortholog analysis of Portiera lineages BTB, BTQ, BTZ1, BTZ3, and BeAf identified two conserved hypothetical protein genes, designated HP1 and HP2. Notably, HP1 exhibited structural variation between lineages: in BTB, BTQ, BTZ1, and BTZ3, it was present as a truncated variant (HP1_1), whereas in BeAf, it appeared as a longer version (HP1_2), with HP1_1 representing a partial sequence of HP1_2. Additionally, lineages BTB, BTQ, BTZ1, and BTZ3 harbored six unique hypothetical protein genes (HP3–HP8) absent in BeAf. Conversely, BeAf possessed a lineage-specific hypothetical protein gene (HP9) not shared by the other four lineages (Fig. 6A). The amino acid sequences of these hypothetical proteins are provided as Supplementary Data S2.
Fig. 6.
Signatures of genes unique to Portiera lineages of Bemisia. A The presence and absence of hypothetical proteins (HPs) in different Portiera lineages. Presence is shown in blue and absence in white. B The structure of genes predicted from amino acid sequences. C The predicted protein structure of HP1_2, a putative ATP-binding cassette (ABC) transporter gene uniquely encoded in Portiera BeAf. The model is built with template 6tqf.1.A. The homodimer is represented with its two monomers colored red and blue, respectively. HP1_1 (from BTB, BTQ, BTZ1, and BTZ3) and HP1_2 (from BeAf) are orthologous, but the sequence of HP1_1 constitutes a partial segment of HP1_2. HP2 is shared by all five lineages. HP3–HP8 are shared by BTB, BTQ, BTZ1, and BTZ3. HP9 is unique to BeAf
To investigate the potential functions of these hypothetical proteins, we performed comprehensive sequence analyses against the InterPro database. Domain prediction revealed that HP1 belongs to the ATP-binding cassette (ABC) subfamily D (IPR050835), and HP5 was classified as a Major facilitator superfamily (MFS) member with sugar transporter-like domains (IPR005828). Subcellular localization predictions were successful for HP1_1, HP1_2, HP3, HP4, HP5, HP7, and HP9. HP1_2 contained multiple transmembrane domains, with HP1_1 representing an extracellular segment of HP1_2. HP3, HP7, and HP9 also exhibited transmembrane architectures. HP4 and HP5 possessed signal peptides and extracellular domains, suggesting their potential roles as novel secretory proteins (Fig. 6B). These computational predictions provide the first structural insights into these previously uncharacterized proteins, although experimental validation will be required to confirm their biological functions.
To elucidate the structural characteristics of the BeAf-specific HP1_2 variant, we performed homology modeling using SWISS-MODEL. The predicted three-dimensional structure indicated that HP1_2 forms a homodimeric complex with both transmembrane domains and a conserved ATP-binding domain (Fig. 6C & Supplementary Data S3). This architecture is characteristic of functional ABC transporters, consistent with its classification in the ABC transporter family. In eukaryotes, ABC subfamily D comprises ABCD1, ABCD2, ABCD3, and ABCD4. ABCD1, ABCD2, and ABCD3 are localized on the peroxisomal membrane, whereas ABCD4 is initially integrated into the endoplasmic reticulum (ER) membrane and is then transported to lysosomes [80]. We predicted the subcellular localization of HP1_2 and human ABCD1–4 (Table 3). Human ABCD1–3 were all predicted to localize to peroxisomes (their mitochondrion scores also exceeded the threshold), while HP1_2, like human ABCD4, was predicted to localize to the ER (its Golgi apparatus score also exceeded the threshold). Because the predictor is designed for eukaryotic proteins and Portiera is a bacterium without these compartments, the grouping of HP1_2 with ABCD4 is best read as suggesting a potential functional analogy in their transport mechanisms rather than a shared cellular location.
Table 3.
Subcellular localization prediction of HP1_2 and human ABC proteins subfamily D
| Localization | Threshold^a | ABCD1 | ABCD2 | ABCD3 | ABCD4 | HP1_2 |
|---|---|---|---|---|---|---|
| Cytoplasm | 0.4761 | 0.1154 | 0.0921 | 0.0814 | 0.0911 | 0.2396 |
| Nucleus | 0.5014 | 0.1208 | 0.1452 | 0.1343 | 0.1286 | 0.1669 |
| Extracellular | 0.6173 | 0.0582 | 0.0497 | 0.0379 | 0.0521 | 0.0249 |
| Cell membrane | 0.5646 | 0.1489 | 0.1955 | 0.2378 | 0.2461 | 0.2955 |
| Mitochondrion | 0.6220 | 0.7596 | 0.7757 | 0.6946 | 0.2402 | 0.1675 |
| Plastid | 0.6395 | 0.0450 | 0.0961 | 0.0327 | 0.0023 | 0.0009 |
| Endoplasmic reticulum | 0.6090 | 0.4709 | 0.4082 | 0.5121 | 0.7475 | 0.7720 |
| Lysosome/Vacuole | 0.5848 | 0.1941 | 0.1474 | 0.1772 | 0.5344 | 0.4822 |
| Golgi apparatus | 0.6494 | 0.1546 | 0.1107 | 0.1792 | 0.4231 | 0.7040 |
| Peroxisome | 0.7364 | 0.9299 | 0.9014 | 0.9023 | 0.4438 | 0.0444 |
^a A localization is predicted if its probability is above the threshold. Predictions were made using DeepLoc (v2.0), a deep-learning tool for predicting the subcellular localization of eukaryotic proteins
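The decision rule in the table footnote (a localization is called when its probability exceeds the class threshold) can be applied directly to the values in Table 3. The sketch below only re-reads the published numbers; the variable and function names are illustrative assumptions and it does not call DeepLoc itself.

```python
PROTEINS = ["ABCD1", "ABCD2", "ABCD3", "ABCD4", "HP1_2"]

# localization: (threshold, probabilities in PROTEINS order), copied from Table 3.
TABLE3 = {
    "Cytoplasm": (0.4761, [0.1154, 0.0921, 0.0814, 0.0911, 0.2396]),
    "Nucleus": (0.5014, [0.1208, 0.1452, 0.1343, 0.1286, 0.1669]),
    "Extracellular": (0.6173, [0.0582, 0.0497, 0.0379, 0.0521, 0.0249]),
    "Cell membrane": (0.5646, [0.1489, 0.1955, 0.2378, 0.2461, 0.2955]),
    "Mitochondrion": (0.6220, [0.7596, 0.7757, 0.6946, 0.2402, 0.1675]),
    "Plastid": (0.6395, [0.0450, 0.0961, 0.0327, 0.0023, 0.0009]),
    "Endoplasmic reticulum": (0.6090, [0.4709, 0.4082, 0.5121, 0.7475, 0.7720]),
    "Lysosome/Vacuole": (0.5848, [0.1941, 0.1474, 0.1772, 0.5344, 0.4822]),
    "Golgi apparatus": (0.6494, [0.1546, 0.1107, 0.1792, 0.4231, 0.7040]),
    "Peroxisome": (0.7364, [0.9299, 0.9014, 0.9023, 0.4438, 0.0444]),
}


def predicted_localizations(table, proteins):
    """Return, per protein, every localization whose probability exceeds its threshold."""
    calls = {protein: [] for protein in proteins}
    for localization, (threshold, probabilities) in table.items():
        for protein, probability in zip(proteins, probabilities):
            if probability > threshold:
                calls[protein].append(localization)
    return calls


calls = predicted_localizations(TABLE3, PROTEINS)
for protein, localizations in calls.items():
    print(f"{protein}: {', '.join(localizations)}")
```

Under this rule the calls are multi-label: ABCD1–3 pass both the mitochondrion and peroxisome thresholds, ABCD4 passes only the endoplasmic reticulum threshold, and HP1_2 passes the endoplasmic reticulum and Golgi apparatus thresholds.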
Discussion
Genome reduction
The genome of Portiera BeAf exhibits classic signatures of reductive evolution, including a small size (356 kb), extreme AT bias (25.3% GC content), and high coding density (74.7%), mirroring patterns seen in other obligate bacterial symbionts [81, 82]. This reduction primarily reflects relaxed selection and genetic drift in the stable intracellular environment, as demonstrated in diverse symbiotic systems ranging from Buchnera in aphids to Polynucleobacter in ciliates [83, 84]. Notably, Portiera lineages from Bemisia exhibit weaker codon usage bias (ENC > 35) compared to other whitefly symbionts (ENC < 35), suggesting reduced selective pressures on gene expression optimization. This parallels observations in Serratia symbiotica, where genome erosion occurs without immediate fitness consequences due to metabolic complementation by co-symbionts [85]. The near absence of positively selected genes (Ka/Ks < 1) further supports the dominance of purifying selection in maintaining essential functions while permitting loss of redundant pathways. Interestingly, while some symbionts like Photodesmus retain motility genes for environmental transmission [86], Portiera's extreme genome reduction reflects strict vertical transmission and complete host dependency [87, 88]. This contrasts with extracellular symbionts that maintain larger genomes despite host association, highlighting how transmission mode shapes reductive trajectories [89, 90]. The convergence toward small genomes across phylogenetically distinct symbionts (e.g., Sulcia [91], Carsonella [92]) underscores the powerful selective advantages of metabolic streamlining in long-term host associations [93, 94].
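For readers unfamiliar with the ENC statistic used above, the following is a minimal sketch of Wright's effective number of codons [61], which ranges from 20 (one codon used per amino acid, strongest bias) to 61 (all synonymous codons used equally). It assumes the standard codon assignments (shared by the bacterial genetic code), pools all coding sequences before counting, and simply skips amino acids with too few observations; the function name and these simplifications are assumptions and may differ from the software used in the study.

```python
from collections import Counter, defaultdict

# Standard codon assignments, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def effective_number_of_codons(cds_list):
    """Wright's ENC: Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, capped at 61."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper().replace("U", "T")
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            # Skip stop codons and codons containing ambiguous bases.
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    # Collect the counts of synonymous codons for each amino acid.
    by_amino_acid = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            by_amino_acid[amino_acid].append(counts[codon])

    # Codon homozygosity F per amino acid, grouped by degeneracy class.
    f_by_class = defaultdict(list)
    for codon_counts in by_amino_acid.values():
        degeneracy = len(codon_counts)
        n = sum(codon_counts)
        if degeneracy == 1 or n < 2:
            continue  # Met/Trp, or too few codons to estimate F
        f = (n * sum((c / n) ** 2 for c in codon_counts) - 1) / (n - 1)
        if f > 0:
            f_by_class[degeneracy].append(f)

    enc = 2.0  # Met and Trp each contribute one codon
    for degeneracy, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_class[degeneracy]
        if not values:
            raise ValueError(f"no usable amino acids with {degeneracy} codons")
        enc += n_amino_acids / (sum(values) / len(values))
    return min(enc, 61.0)
```

With this definition, a higher value means weaker bias, which is the sense in which ENC > 35 for Bemisia-associated lineages is read above as weaker codon usage bias than ENC < 35 in the other whitefly symbionts.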
Genome rearrangements and instability
The observed genome rearrangements and instability in Portiera lineages of B. afer and B. tabaci highlight a lineage-specific pattern of genomic plasticity. While previous studies attributed this instability primarily to the loss of dnaQ, the DNA polymerase III subunit epsilon [25], our findings reveal a more comprehensive explanation: Bemisia-associated Portiera lineages have lost not only dnaQ but also multiple other genes involved in the replication and repair pathway (e.g., dnaN, holA, ssb). The cumulative loss of these critical repair genes likely exacerbates genomic instability by impairing DNA proofreading, mismatch repair, and recombination resolution. This contrasts sharply with the structural conservation seen in Portiera lineages from other whitefly species, such as T. vaporariorum, where retained DNA repair mechanisms maintain genomic stability [26]. The absence of these repair pathways may lead to increased mutation rates and recombination-mediated rearrangements, particularly in repetitive intergenic regions, which serve as substrates for such events. Similar patterns of instability have been documented in other obligate symbionts, such as S. symbiotica, where the loss of repair genes and proliferation of mobile elements contribute to dynamic genome architectures [85]. Notably, the distinct rearrangement patterns between B. afer and B. tabaci suggest independent evolutionary trajectories, possibly influenced by host-specific selective pressures or compensatory mechanisms from facultative symbionts [25, 95]. These findings underscore the role of relaxed selection and genetic drift in shaping symbiont genome evolution, where the incomplete replication and repair pathway exacerbates structural instability without immediate fitness costs [96]. Further studies are needed to elucidate whether selfish genetic elements, such as restriction-modification systems, also contribute to the observed rearrangements, as seen in other bacterial symbionts [97, 98].
Lysine biosynthesis
Our findings reveal a striking divergence in lysine biosynthesis capabilities between Portiera lineages associated with Bemisia and those from other whitefly genera, underscoring the dynamic nature of genomic erosion in obligate symbionts. While Portiera of B. tabaci exhibits pseudogenization of dapB and complete loss of lysA and dapF [27, 79], our study demonstrates that Portiera of B. afer has retained an intact dapB gene despite sharing the loss of lysA and dapF with B. tabaci-associated lineages, suggesting host species-specific degradation of lysine synthesis genes in the endosymbiont genome. This contrasts sharply with the retention of intact lysine pathways in Portiera from other whiteflies, mirroring the metabolic conservation observed in Buchnera aphidicola, where dapD remains functional despite reductive evolution [99]. The loss of lysA in Bemisia-associated Portiera parallels findings in Blattabacterium sp., the symbiont of cockroaches, where pseudogenization of lysA disrupts lysine production, necessitating host reliance on dietary or microbial supplementation [100]. However, unlike Blattabacterium, Bemisia compensates via horizontally acquired lysA and dapF [16, 101], a strategy reminiscent of the brown planthopper Nilaparvata lugens, which leverages yeast-like symbionts to restore lysine biosynthesis [102]. The absence of compensatory Hamiltonella-derived lysine synthesis in Bemisia-associated lineages [101] and the lineage-specific pseudogenization of dapB in B. tabaci-associated lineages further highlight the uniqueness of host species-specific gene loss. This evolutionary trajectory aligns with the broader theme of symbiont genome reduction being offset by host genetic innovations or multi-partner symbioses [103], emphasizing the interplay between genomic decay and metabolic resilience in symbiotic systems.
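The distinction between an intact and a pseudogenized dapB rests on whether the reading frame is interrupted by premature stop codons. A minimal sketch of such a check on a coding sequence is given below; the function name and the toy sequences are hypothetical and are not the dapB sequences analyzed in this study.

```python
# Standard codon assignments, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def premature_stops(cds):
    """Return 1-based codon positions of in-frame stop codons before the final codon."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return [
        position
        for position, codon in enumerate(codons[:-1], start=1)
        if CODON_TABLE.get(codon) == "*"
    ]


# Toy sequences for illustration only.
intact_example = "ATGAAAGGTTTATAA"     # ATG AAA GGT TTA TAA
disrupted_example = "ATGAAATGATTATAA"  # ATG AAA TGA TTA TAA
print(premature_stops(intact_example))
print(premature_stops(disrupted_example))
```

A non-empty result flags a candidate pseudogene; in practice the call would still need to be confirmed against the alignment, as in Fig. 5C and Supplementary Fig. S6.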
Unique genes in Portiera of Bemisia
We predicted an ABCD4-like protein, HP1_2, that is unique to Portiera BeAf. Although an ortholog (HP1_1) is present in the genome of B. tabaci-associated Portiera, its incomplete sequence means it may not function similarly to ABCD4. Eukaryotic ABCD4 is involved in the transport of vitamin B12 (cobalamin) from lysosomes into the cytoplasm, playing a critical role in cobalamin metabolism and cellular homeostasis. Bacteria lack strict counterparts of the eukaryotic ABCD subfamily, but some bacterial ABC transporters perform similar functions, such as MsbA for lipid transport [104] or MacB for drug efflux [105]. HP1_2 is predicted to function in substrate transport across membranes, possibly related to nutrient uptake or detoxification.
Another hypothetical protein, HP5, was classified as an MFS member with sugar transporter-like domains. It was detected exclusively in Portiera of B. tabaci and not in Portiera of other whitefly species. Members of this family are responsible for the binding and transport of various carbohydrates, organic alcohols, and acids in prokaryotic and eukaryotic organisms [106]. HP5 might facilitate the transport of sugar metabolites derived from the host’s phloem sap, ensuring a steady energy supply for the symbiont’s metabolic activities. It may also mediate the transport of organic alcohols or acids, which might be critical for maintaining symbiotic homeostasis or modulating Portiera’s physiology under specific nutritional conditions. The absence of HP5 from Portiera of other whitefly species may indicate that B. tabaci has evolved a distinct metabolic dependency on Portiera, possibly enhancing the efficiency of nutrient exchange in this particular system. Further functional studies, such as heterologous expression or knockdown experiments, could elucidate whether HP5 contributes to the fitness of B. tabaci by optimizing symbiotic nutrient provisioning or stress adaptation.
For the remaining seven hypothetical proteins, no functional domains were predicted, making their functions difficult to infer. These comprise HP2, which is shared by all five Bemisia-associated lineages; HP3, HP4, HP6, HP7, and HP8, which are unique to B. tabaci-associated Portiera lineages; and HP9, which is uniquely present in Portiera BeAf but whose transcriptional activity remains unverified. Further studies, such as proteomic detection or knockout experiments, could help determine whether these proteins are functionally expressed and whether they contribute to lineage-specific adaptations in Portiera.
Influence of facultative symbionts on Portiera genome evolution
The evolutionary trajectory of Portiera is likely influenced by the presence of facultative symbionts in whiteflies, which can compensate for the genomic decay of the obligate symbiont. For instance, in Bemisia tabaci, facultative symbionts retain genes involved in amino acid, B vitamin, and cofactor biosynthesis, thereby contributing to host nutrition and potentially relaxing selective pressures on Portiera to maintain these pathways. Candidatus Hamiltonella defensa retains the biosynthetic pathway for biotin and partially complements lysine biosynthesis, both of which are incomplete in Portiera [16, 101]. Similarly, Arsenophonus bemisiae and Wolbachia wTabA have been shown to retain partial or complete pathways for riboflavin and nicotinic acid [79]. This suggests that facultative symbionts may alter the selective landscape acting on Portiera, potentially accelerating its genomic degeneration in some pathways while enabling host survival. The presence of multiple symbionts within the same host likely facilitates such metabolic exchanges and genomic adjustments, underscoring the importance of considering the entire symbiotic community when studying the evolution of obligate endosymbionts.
Conclusions
This study presents the first complete genome of Portiera BeAf, the obligate endosymbiont of B. afer, revealing novel genomic features and evolutionary dynamics and supporting its status as a novel species within the genus Portiera. The genome exhibits classic signatures of reductive evolution, including extreme AT bias, high coding density, and significant gene loss, particularly in the DNA replication and repair and lysine biosynthesis pathways. Phylogenetic analyses confirm a close relationship between Portiera BeAf and lineages from B. tabaci, yet highlight unique structural rearrangements and lineage-specific gene losses. The identification of lineage-specific hypothetical proteins, such as the ABCD4-like transporter in BeAf, suggests potential adaptations in nutrient transport or stress response, though their functional roles require further validation. Comparative genomics reveals that Portiera lineages from Bemisia exhibit distinct evolutionary trajectories, characterized by weakened codon usage bias and accelerated substitution rates, reflecting relaxed selection in their obligate symbiotic niche. These findings deepen our understanding of Portiera's genomic diversity and adaptive strategies, while emphasizing the interplay between genome reduction, functional compensation, and host-specific adaptations in shaping the evolution of ancient bacterial symbionts.
Supplementary Information
Acknowledgements
Not applicable.
Abbreviations
- ABC: ATP-Binding Cassette
- ANI: Average Nucleotide Identity
- BIC: Bayesian Information Criterion
- BUSCO: Benchmarking Universal Single-Copy Orthologs
- CAI: Codon Adaptation Index
- CDS: Coding Sequence
- COG: Clusters of Orthologous Groups
- EAA: Essential Amino Acids
- ENC: Effective Number of Codons
- HP: Hypothetical Protein
- Ka: Nonsynonymous Substitutions per Nonsynonymous Site
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- KO: KEGG Orthology
- Ks: Synonymous Substitutions per Synonymous Site
- ML: Maximum Likelihood
- MFS: Major Facilitator Superfamily
- NCBI: National Center for Biotechnology Information
- CNCB-NGDC: National Genomics Data Center, China National Center for Bioinformation
- PCG: Protein-Coding Genes
- RSS: Reference Similarity Score
Authors' contributions
TL and YQL designed the research. HLW performed laboratory work. YYW, YJC, and CCZ analyzed the data. YYW, YJC and HLW drafted the manuscript. All authors have read and approved the final manuscript.
Funding
This study was funded by the Earmarked Fund for China Agriculture Research System (No. CARS-23-C05), the Science & Technology Project of Taizhou (No. 25nya21), and the Undergraduate Innovation and Entrepreneurship Training Program (No. S202510350038).
Data availability
The data supporting the findings of this study are openly available in CNCB-NGDC [47]. The complete genome of Portiera BeAf is deposited in CNCB-NGDC under accession number GWHGGFS01000000.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interest
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Yu-Yi Wang, Yi-Jia Chen and Hua-Ling Wang contributed equally to this work.
Contributor Information
Teng Lei, Email: leiteng@tzc.edu.cn.
Yin-Quan Liu, Email: yqliu@zju.edu.cn.
References
- 1. Maruthi MN, Rekha AR, Sseruwagi P, Hillocks RJ. Mitochondrial DNA variability and development of a PCR diagnostic test for populations of the whitefly Bemisia afer (Priesner and Hosny). Mol Biotechnol. 2007;35:31–40.
- 2. Wang J-R, Song Z-Q, Du Y-Z. Six new record species of whiteflies (Hemiptera: Aleyrodidae) infesting Morus alba in China. J Insect Sci. 2014;14(1):274.
- 3. Krause-Sakate R, Watanabe LFM, Gorayeb ES, da Silva FB, Alvarez DdL, Bello VH, et al. Population dynamics of whiteflies and associated viruses in South America: research progress and perspectives. Insects. 2020;11(12):847.
- 4. Khamis FM, Ombura FLO, Ajene IJ, Akutse KS, Subramanian S, Mohamed SA, et al. Mitogenomic analysis of diversity of key whitefly pests in Kenya and its implication to their sustainable management. Sci Rep. 2021;11:6348.
- 5. Kepngop LRK, Wosula EN, Amour M, Ghomsi PGT, Wakam LN, Kansci G, et al. Genetic diversity of whiteflies colonizing crops and their associated endosymbionts in three agroecological zones of Cameroon. Insects. 2024;15(9):657.
- 6. Martin JH, Mifsud D, Rapisarda C. The whiteflies (Hemiptera: Aleyrodidae) of Europe and the Mediterranean Basin. Bull Entomol Res. 2000;90(5):407–48.
- 7. Gamarra HA, Fuentes S, Morales FJ, Glover R, Malumphy C, Barker I. Bemisia afer sensu lato, a vector of Sweet potato chlorotic stunt virus. Plant Dis. 2010;94(5):510–4.
- 8. Munguti FM, Kilalo DC, Nyaboga EN, Wosula EN, Macharia I, Mwango’mbe AW. Distribution and molecular diversity of whitefly species colonizing cassava in Kenya. Insects. 2021;12(10):875.
- 9. Ally HM, Hamss HE, Simiand C, Maruthi MN, Colvin J, Omongo CA, et al. What has changed in the outbreaking populations of the severe crop pest whitefly species in cassava in two decades? Sci Rep. 2019;9:14796.
- 10. Namuddu A, Seal S, van Brunschot S, Malka O, Kabaalu R, Morin S, et al. Distribution of Bemisia tabaci in different agro-ecological regions in Uganda and the threat of vector-borne pandemics into new cassava growing areas. Front Sustain Food Syst. 2023;7:1068109.
- 11. Himler AG, Adachi-Hagimori T, Bergen JE, Kozuch A, Kelly SE, Tabashnik BE, et al. Rapid spread of a bacterial symbiont in an invasive whitefly is driven by fitness benefits and female bias. Science. 2011;332(6026):254–6.
- 12. Fan Z-Y, Liu Y, He Z-Q, Wen Q, Chen X-Y, Khan MM, et al. Rickettsia infection benefits its whitefly hosts by manipulating their nutrition and defense. Insects. 2022;13(12):1161.
- 13. Lei T, Zhao J, Wang HL, Liu YQ, Liu SS. Impact of a novel Rickettsia symbiont on the life history and virus transmission capacity of its host whitefly (Bemisia tabaci). Insect Sci. 2021;28(2):377–91.
- 14. Douglas AE. Multiorganismal insects: diversity and function of resident microorganisms. Annu Rev Entomol. 2015;60(1):17–34.
- 15. Ren F-R, Sun X, Wang T-Y, Yan J-Y, Yao Y-L, Li C-Q, et al. Pantothenate mediates the coordination of whitefly and symbiont fitness. ISME J. 2021;15(6):1655–67.
- 16. Luan J-B, Chen W, Hasegawa DK, Simmons AM, Wintermantel WM, Ling K-S, et al. Metabolic coevolution in the bacterial symbiosis of whiteflies and related plant sap-feeding insects. Genome Biol Evol. 2015;7(9):2635–47.
- 17. Sun X, Liu BQ, Li CQ, Chen ZB, Xu XR, Luan JB. A novel microRNA regulates cooperation between symbiont and a laterally acquired gene in the regulation of pantothenate biosynthesis within Bemisia tabaci whiteflies. Mol Ecol. 2022;31(9):2611–24.
- 18. Lv C, Zhang S-X, Hong J-S, Wang T-Y, Liu B-Q, Li C-Q, et al. The phenylalanine synthesized by whitefly-Portiera symbiosis enhances host survival under fungi infection. J Pest Sci. 2025;98:1949–61.
- 19. Sun X, Liu B-Q, Chen Z-B, Li C-Q, Li X-Y, Hong J-S, et al. Vitellogenin facilitates associations between the whitefly and a bacteriocyte symbiont. mBio. 2023;14(1):1–13.
- 20. Wang Y-B, Ren F-R, Yao Y-L, Sun X, Walling LL, Li N-N, et al. Intracellular symbionts drive sex ratio in the whitefly by facilitating fertilization and provisioning of B vitamins. ISME J. 2020;14(12):2923–35.
- 21. Santos-Garcia D, Farnier P-A, Beitia F, Zchori-Fein E, Vavre F, Mouton L, et al. Complete genome sequence of “Candidatus Portiera aleyrodidarum” BT-QVLC, an obligate symbiont that supplies amino acids and carotenoids to Bemisia tabaci. J Bacteriol. 2012;194(23):6654–5.
- 22. Sloan DB, Moran NA. Endosymbiotic bacteria as a source of carotenoids in whiteflies. Biol Lett. 2012;8(6):986–9.
- 23. Santos-Garcia D, Vargas-Chavez C, Moya A, Latorre A, Silva FJ. Genome evolution in the primary endosymbiont of whiteflies sheds light on their divergence. Genome Biol Evol. 2015;7(3):873–88.
- 24. Lei T, Luo N, Song C, Yu J, Zhou Y, Qi X, et al. Comparative genomics reveals three genetic groups of the whitefly obligate endosymbiont Candidatus Portiera aleyrodidarum. Insects. 2023;14(11):888.
- 25. Santos-Garcia D, Mestre-Rincon N, Ouvrard D, Zchori-Fein E, Morin S, Martinez-Romero E. Portiera gets wild: genome instability provides insights into the evolution of both whiteflies and their endosymbionts. Genome Biol Evol. 2020;12(11):2107–24.
- 26. Sloan DB, Moran NA. The evolution of genomic instability in the obligate endosymbionts of whiteflies. Genome Biol Evol. 2013;5(5):783–93.
- 27. Zhu D-T, Zou C, Ban F-X, Wang H-L, Wang X-W, Liu Y-Q. Conservation of transcriptional elements in the obligate symbiont of the whitefly Bemisia tabaci. PeerJ. 2019;7:e7477.
- 28. Thao ML, Baumann P. Evolutionary relationships of primary prokaryotic endosymbionts of whiteflies and their hosts. Appl Environ Microbiol. 2004;70(6):3401–6.
- 29. Wang H, Geng S, Liu S, Li Z, Cameron S, Lei T, et al. Unraveling the cryptic Bemisia tabaci species complex: Global phylogenomic analysis reveals evolutionary relationships and biogeographic patterns. Insect Sci. 2025. 10.1111/1744-7917.13501.
- 30. Chen S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta. 2023;2:e107.
- 31. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.
- 32. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
- 33. Quinlan AR, Hall IM. BEDtools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
- 34. Seqtk. https://github.com/lh3/seqtk. Accessed on 1 July 2025.
- 35. Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017;13(6):e1005595.
- 36. Dierckxsens N, Mardulyn P, Smits G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2016;45(4):e18.
- 37. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
- 38. Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.
- 39. Moro G, Atzeni R, Al-Subhi A, Marche MG. Compàregenome: a command-line tool for genomic diversity estimation in prokaryotes and eukaryotes. BMC Bioinf. 2025;26:14.
- 40. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.
- 41. Schwengers O, Jelonek L, Dieckmann MA, Beyvers S, Blom J, Goesmann A. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom. 2021;7:000685.
- 42. Larralde M. Pyrodigal: Python bindings and interface to Prodigal, an efficient method for gene prediction in prokaryotes. J Open Source Softw. 2022;7(72):4296.
- 43. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25(5):955–64.
- 44. Laslett D. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32(1):11–6.
- 45. Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29(22):2933–5.
- 46. Stothard P, Grant JR, Van Domselaar G. Visualizing and comparing circular genomes using the CGView family of tools. Briefings Bioinf. 2019;20(4):1576–82.
- 47. CNCB-NGDC members and partners. Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2022. Nucl Acids Res. 2022;50(D1):D27–38.
- 48. Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018;9:5114.
- 49. Thompson CC, Chimetto L, Edwards RA, Swings J, Stackebrandt E, Thompson F. Microbial genomic taxonomy. BMC Genomics. 2013;14:913.
- 50. Tang H, Krishnakumar V, Zeng X, Xu Z, Taranto A, Lomas JS, et al. JCVI: a versatile toolkit for comparative genomics analysis. iMeta. 2024;3:e211.
- 51. Emms DM, Kelly S. Orthofinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238.
- 52. Conway JR, Lex A, Gehlenborg N. UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics. 2017;33(18):2938–40.
- 53. Lemon J. Plotrix: a package in the red light district of R. R-News. 2006;6(4):8–12.
- 54. Katoh K. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucl Acids Res. 2005;33(2):511–8.
- 55. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. TrimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25(15):1972–3.
- 56. Phylotools. https://github.com/helixcn/phylotools. Accessed on 10 July 2025.
- 57. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.
- 58. FigTree. https://github.com/rambaut/figtree. Accessed on 10 July 2025.
- 59. Wang D, Zhang Y, Zhang Z, Zhu J, Yu J. KaKs_calculator 2.0: a toolkit incorporating Gamma-series methods and sliding window strategies. Genomics Proteomics Bioinf. 2010;8(1):77–80.
- 60. Wickham H. ggplot2: Elegant graphics for data analysis. New York: Springer-Verlag; 2016.
- 61. Wright F. The ‘effective number of codons’ used in a gene. Gene. 1990;87(1):23–9.
- 62. Peden JF. Analysis of codon usage. [Doctoral dissertation]. Nottingham: University of Nottingham. 1999.
- 63. ggpointdensity. https://github.com/LKremer/ggpointdensity. Accessed on 31 October 2025.
- 64. Sharp PM, Li W-H. The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987;15(3):1281–95.
- 65. Nakamura Y, Gojobori T, Ikemura T. Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucleic Acids Res. 2000;28:292.
- 66. Puigbò P, Bravo IG, Garcia-Vallve S. CAIcal: a combined set of tools to assess codon usage adaptation. Biol Direct. 2008;3:38.
- 67. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012;40(D1):D109–14.
- 68. Xiao N. ggsci: Scientific journal and sci-fi themed color palettes for 'ggplot2'. R package version 4.1.0. 2025.
- 69. Kassambara A. ggpubr: 'ggplot2' based publication ready plots. R package version 0.6.1. 2025.
- 70. Wickham H, Pedersen T, Seidel D. scales: Scale functions for visualization. R package version 1.4.0. 2025.
- 71. Galperin MY, Vera Alvarez R, Karamycheva S, Makarova KS, Wolf YI, Landsman D, et al. COG database update 2024. Nucl Acids Res. 2025;53(D1):D356–63.
- 72. Blum M, Andreeva A, Florentino LC, Chuguransky SR, Grego T, Hobbs E, et al. InterPro: the protein sequence classification resource in 2025. Nucl Acids Res. 2025;53(D1):D444–56.
- 73. Blum M, Andreeva A, Florentino LC, Chuguransky SR, Grego T, Hobbs E, et al. InterPro: the protein sequence classification resource in 2025. Nucl Acids Res. 2025;53(D1):D444–56.
- 74. Wilkins D. gggenes: Draw gene arrow maps in 'ggplot2'. R package version 0.5.0. 2023.
- 75. Bienert S, Waterhouse A, de Beer TAP, Tauriello G, Studer G, Bordoli L, et al. The SWISS-MODEL Repository—new features and functionality. Nucl Acids Res. 2017;45(D1):D313–9.
- 76. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucl Acids Res. 2018;46(W1):W296–303.
- 77. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF chimera—a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605–12.
- 78. Thumuluri V, Almagro Armenteros JJ, Johansen AR, Nielsen H, Winther O. DeepLoc 2.0: multi-label subcellular localization prediction using protein language models. Nucleic Acids Res. 2022;50:W228–34.
- 79.Zhu DT, Rao Q, Zou C, Ban FX, Zhao JJ, Liu SS. Genomic and transcriptomic analyses reveal metabolic complementarity between whiteflies and their symbionts. Insect Sci. 2022;29(2):539–49.
- 80.Okamoto T, Kawaguchi K, Watanabe S, Agustina R, Ikejima T, Ikeda K, et al. Characterization of human ATP-binding cassette protein subfamily D reconstituted into proteoliposomes. Biochem Biophys Res Commun. 2018;496(4):1122–7.
- 81.Moran NA. Tracing the evolution of gene loss in obligate bacterial symbionts. Curr Opin Microbiol. 2003;6(5):512–8.
- 82.McCutcheon JP, Moran NA. Extreme genome reduction in symbiotic bacteria. Nat Rev Microbiol. 2012;10(1):13–26.
- 83.Moran NA, Bennett GM. The tiniest tiny genomes. Annu Rev Microbiol. 2014;68(1):195–215.
- 84.Boscaro V, Kolisko M, Felletti M, Vannini C, Lynn DH, Keeling PJ. Parallel genome reduction in symbionts descended from closely related free-living bacteria. Nat Ecol Evol. 2017;1(8):1160–7.
- 85.Manzano-Marín A, Coeur d’acier A, Clamens A-L, Orvain C, Cruaud C, Barbe V, et al. A freeloader? The highly eroded yet large genome of the Serratia symbiotica symbiont of Cinara strobi. Genome Biol Evol. 2018;10(9):2178–89.
- 86.Hendry TA, de Wet JR, Dougan KE, Dunlap PV. Genome evolution in the obligate but environmentally active luminous symbionts of flashlight fish. Genome Biol Evol. 2016;8(7):2203–13.
- 87.Lane CE. Bacterial endosymbionts: genome reduction in a hot spot. Curr Biol. 2007;17(13):R508–10.
- 88.Burke GR, Moran NA. Massive genomic decay in Serratia symbiotica, a recently evolved symbiont of aphids. Genome Biol Evol. 2011;3:195–208.
- 89.Hendry TA, Freed LL, Fader D, Fenolio D, Sutton TT, Lopez JV. Ongoing transposon-mediated genome reduction in the luminous bacterial symbionts of deep-sea ceratioid anglerfishes. MBio. 2018;9(3):e01033–18.
- 90.González-Pech RA, Shepherd J, Fuller Z, LaJeunesse TC, Parkinson JE. The genome of a giant clam zooxanthella (Cladocopium infistulum) offers few clues to adaptation as an extracellular symbiont with high thermotolerance. BMC Genomics. 2024;25:914.
- 91.Ankrah NYD, Chouaia B, Douglas AE, Wilson A, Ruby EG. The cost of metabolic interactions in symbioses between insects and bacteria with reduced genomes. MBio. 2018;9(5):e01433–18.
- 92.Sloan DB, Moran NA. Genome reduction and co-evolution between the primary and secondary bacterial symbionts of psyllids. Mol Biol Evol. 2012;29(12):3781–92.
- 93.Santos-Garcia D, Latorre A, Moya A, Gibbs G, Hartung V, Dettner K, et al. Small but powerful, the primary endosymbiont of moss bugs, Candidatus Evansia muelleri, holds a reduced genome with large biosynthetic capabilities. Genome Biol Evol. 2014;6(7):1875–93.
- 94.Kashkouli M, Castelli M, Floriano AM, Bandi C, Epis S, Fathipour Y, et al. Characterization of a novel Pantoea symbiont allows inference of a pattern of convergent genome reduction in bacteria associated with Pentatomidae. Environ Microbiol. 2020;23(1):36–50.
- 95.Santos-Garcia D, Rollat-Farnier P-A, Beitia F, Zchori-Fein E, Vavre F, Mouton L, et al. The genome of Cardinium cBtQ1 provides insights into genome reduction, symbiont motility, and its settlement in Bemisia tabaci. Genome Biol Evol. 2014;6(4):1013–30.
- 96.Oakeson KF, Gil R, Clayton AL, Dunn DM, von Niederhausern AC, Hamil C, et al. Genome degeneration and adaptation in a nascent stage of symbiosis. Genome Biol Evol. 2014;6(1):76–93.
- 97.Kobayashi I, Nobusato A, Kobayashi-Takahashi N, Uchiyama I. Shaping the genome – restriction–modification systems as mobile genetic elements. Curr Opin Genet Dev. 1999;9(6):649–56.
- 98.Zheng H, Dietrich C, Hongoh Y, Brune A. Restriction-modification systems as mobile genetic elements in the evolution of an intracellular symbiont. Mol Biol Evol. 2015;33(3):721–5.
- 99.Thao ML, Baumann P. Sequence analysis of a DNA fragment from Buchnera aphidicola (aphid endosymbiont) containing the genes dapD-htrA-ilvI-ilvH-ftsL-ftsI-murE. Curr Microbiol. 1998;37(3):214–6.
- 100.Neef A, Latorre A, Peretó J, Silva FJ, Pignatelli M, Moya A. Genome economization in the endosymbiont of the wood roach Cryptocercus punctulatus due to drastic loss of amino acid synthesis capabilities. Genome Biol Evol. 2011;3:1437–48.
- 101.Bao X-Y, Yan J-Y, Yao Y-L, Wang Y-B, Visendi P, Seal S, et al. Lysine provisioning by horizontally acquired genes promotes mutual dependence between whitefly and two intracellular symbionts. PLoS Pathog. 2021;17(11):e1010120.
- 102.Wan PJ, Yang L, Wang WX, Fan JM, Fu Q, Li GQ. Constructing the major biosynthesis pathways for amino acids in the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae), based on the transcriptome data. Insect Mol Biol. 2014;23(2):152–64.
- 103.Matsuura Y, Moriyama M, Łukasik P, Vanderpool D, Tanahashi M, Meng X-Y, et al. Recurrent symbiont recruitment from fungal parasites in cicadas. Proc Natl Acad Sci USA. 2018;115(26):E5970–9.
- 104.Zhou Z, White KA, Polissi A, Georgopoulos C, Raetz CRH. Function of Escherichia coli MsbA, an essential ABC family transporter, in lipid A and phospholipid biosynthesis. J Biol Chem. 1998;273(20):12466–75.
- 105.Tikhonova EB, Devroy VK, Lau SY, Zgurskaya HI. Reconstitution of the Escherichia coli macrolide transporter: the periplasmic membrane fusion protein MacA stimulates the ATPase activity of MacB. Mol Microbiol. 2007;63(3):895–910.
- 106.Henderson PJF, Maiden MCJ. Homologous sugar transport proteins in Escherichia coli and their relatives in both prokaryotes and eukaryotes. Philos Trans R Soc Lond B Biol Sci. 1990;326(1236):391–410.
Data Availability Statement
The data supporting the findings of this study are openly available in CNCB-NGDC [47]. The complete genome of *Portiera* BeAf is deposited in CNCB-NGDC under accession number GWHGGFS01000000.






