The Plant Cell. 2025 Oct 27;37(11):koaf259. doi: 10.1093/plcell/koaf259

Targeted genetic manipulation and yeast-like evolutionary genomics in the green alga Auxenochlorella

Rory J Craig 1,2,c, Marco A Dueñas 3,#, Dimitrios J Camacho 4,#, Sean D Gallaher 5, Maria Clara Avendaño-Monsalve 6, Yang-Tsung Lin 7, Crysten E Blaby-Haas 8,9, Jeffrey L Moseley 10, Sabeeha S Merchant 11,12,13,14
PMCID: PMC12636532  PMID: 41140129

Abstract

Auxenochlorella spp. are diploid oleaginous green algae whose streamlined genomes can be readily manipulated by homologous recombination, making them highly amenable to discovery research and bioengineering. Vegetatively diploid organisms experience specific evolutionary phenomena, including allodiploid hybridization, mitotic recombination, loss-of-heterozygosity, and aneuploidy; however, studies of these forces have largely focused on yeasts. Here, we present a telomere-to-telomere phased diploid genome assembly of Auxenochlorella UTEX 250-A (haploid length 22 Mb) and introduce a genetic toolkit for site-specific manipulation of the nuclear genome in multiple strains, featuring several selectable markers, inducible promoters, and fluorescent reporters for protein localization. UTEX 250-A is an allodiploid hybrid of Auxenochlorella protothecoides and Auxenochlorella symbiontica, two species differentiated by extensive chromosomal rearrangements. UTEX 250-A haplotypes are a mosaic of each parental species following mitotic recombination, and two chromosomes are trisomic. Loss-of-heterozygosity events are pervasive across Auxenochlorella and can evolve rapidly in the laboratory. High-quality structural annotation yielded ∼7,500 genes per haplotype. Auxenochlorella have experienced gene family loss and reduction, including core photosynthesis genes, and exhibit periodic adenine and cytosine methylation at promoters and gene bodies, respectively. Approximately 10% of genes, especially those involved in DNA repair and sex, overlap antisense long noncoding RNAs, which may participate in a regulatory mechanism. We demonstrate the utility of Auxenochlorella for fundamental research by knockout of a chlorophyll biosynthesis enzyme, and confirm one trisomy by allele-specific transformation. These results demonstrate the generality of several evolutionary forces associated with vegetative diploidy and provide a foundation for the use of Auxenochlorella as a reference organism.


Auxenochlorella, green algae shaped by evolutionary forces acting on vegetative diploids, are amenable to discovery research and bioengineering via efficient site-specific homologous recombination.

Introduction

Green algae present excellent opportunities for discovery in plant biology, especially concerning fundamental molecular and cellular processes. The primary algal model Chlamydomonas reinhardtii has been central to research in photosynthesis and chloroplast biogenesis, in addition to functions and processes that are broadly relevant to eukaryotic cells, such as cilia/ciliogenesis and the cell cycle (Sasso et al. 2018; Salome and Merchant 2019). As photosynthetic organisms that can utilize atmospheric carbon dioxide in the production of high-value bioproducts, including specialty chemicals and nutraceuticals, green algae also hold great potential in biotechnology (Goold et al. 2024). The ability to edit, add, or remove genes is essential to both of these endeavors. Despite the high growth rates and experimental tractability of several algae, the tools to genetically manipulate algal genomes have typically lagged behind those available for many bacteria and yeasts (Sproles et al. 2021). In C. reinhardtii, transgenes are typically integrated at untargeted locations via nonhomologous end-joining, although recent advances in ribonucleoprotein-mediated approaches have enabled site-specific manipulation (Nievergelt 2025). Nontargeted transgene integration and CRISPR-Cas9 gene editing are also possible in species from industrially relevant trebouxiophyte genera such as Chlorella and Picochlorum (Yang et al. 2016; Lin and Ng 2020; Krishnan et al. 2025). However, efficient site-specific transformation by homologous recombination, as is commonplace in the budding yeast Saccharomyces cerevisiae (Orr-Weaver et al. 1981), is incredibly rare in green algae, having been reported sparingly in species like the ecologically relevant prasinophyte Ostreococcus tauri (Lozano et al. 2014).

Auxenochlorella is a genus of trebouxiophyte green algae in which homologous recombination is efficient. Site-specific transformation of the nuclear genome in Auxenochlorella protothecoides, as well as its close relatives from the genus Prototheca, has been reported in patent filings (Franklin et al. 2013; Moseley et al. 2024). While Prototheca are obligate heterotrophs that have lost the ability to photosynthesize, Auxenochlorella can grow robustly as a phototroph, mixotroph, or heterotroph, undergoing a metabolic switch that degrades the photosynthetic apparatus when provided with an organic carbon source (Matsuka et al. 1966). Therefore, it is facile to study photosynthetic gene function through reverse genetics, making Auxenochlorella an attractive model in plant biology. Both Auxenochlorella and Prototheca are oleaginous, and Auxenochlorella oil and biomass have generally recognized as safe (GRAS) status, favoring their use for engineering of specific lipids for bioproducts and food applications, in addition to biofuels (Franklin et al. 2013; Brooks et al. 2024; Goold et al. 2024). We recently demonstrated the utility of Auxenochlorella for discovery research in a study requiring introduction and quantitation of gene expression from several synthetic gene constructs to deduce the mechanism underlying the translation of bicistronic genes (Dueñas et al. 2025a), which are widespread in green algae (Gallaher et al. 2021).

Like budding yeast, Auxenochlorella are vegetatively diploid. Microbial organisms that undergo periods of asexual reproduction as diploids are associated with several molecular and evolutionary phenomena. These processes are best studied in S. cerevisiae and its close relatives, which undergo a facultatively sexual life cycle featuring long periods of asexual division as a diploid (Tsai et al. 2008; Fischer et al. 2021). Although inter-tetrad outcrossing is relatively rare in yeast, one interesting facet of this life cycle is occasional alloploid hybridization. Mating of haploid gametes from divergent lineages produces an allodiploid hybrid that inherits one set of chromosomes from each parent, whereas allotetraploid hybrids can arise via conjugation of divergent diploid cells or whole-genome duplication of an allodiploid individual (Sipiczki 2008; Gabaldón 2020). Although allodiploid hybrids may be incapable of meiosis due to incompatibilities between the parental chromosomes, they can persist via asexual division (Gabaldón 2020). Indeed, hybrids may exhibit increased fitness or unique phenotypes relative to their parents (i.e. heterosis), and hybrid Saccharomyces strains have successfully adapted to anthropogenic environments such as alcoholic ferments (Sipiczki 2008; Peris et al. 2018; Gallone et al. 2019; Langdon et al. 2019) and olive brine (Pontes et al. 2019). Hybridization has also been linked to the evolution of pathogenicity in taxa including Candida yeasts (Pryszcz et al. 2015; Schroder et al. 2016) and Aspergillus filamentous fungi (Steenwyk et al. 2020). Importantly, allodiploid hybridization is not necessarily an evolutionary dead end, since the correct pairing of homologous chromosomes and sexual compatibility can be restored via whole-genome duplication (Charron et al. 2019).
Exemplifying this process, an ancient hybridization event is thought to have preceded the whole-genome duplication in the lineage leading to Saccharomyces (Marcet-Houben and Gabaldón 2015). Allopolyploidy (hybridization and genome duplication) has also long been recognized as a major force in plant evolution and speciation (Soltis and Soltis 2009).

Although recombination is primarily associated with meiotic cell division, various mitotic recombination processes can occur during asexual growth. In S. cerevisiae, mitotic recombination occurs randomly due to double-strand breaks at a rate that is orders of magnitude lower than meiotic recombination (Mancera et al. 2008; Sui et al. 2020). Nonetheless, over lengthy periods of asexuality, mitotic recombination can drastically homogenize genomes via loss-of-heterozygosity (LOH) events, where an allele of one chromosome is replaced by the allele carried by its homologous chromosome (Dutta and Schacherer 2025). LOH can occur at short regions along chromosomes of generally less than 10 kb via gene conversion (i.e. interstitial LOH), or at larger regions that typically extend from an internal location to the telomere (terminal LOH), which are the products of mitotic crossovers or the break-induced replication DNA repair pathway (Sui et al. 2020; Dutta et al. 2021). While LOH can negatively impact fitness (e.g. by unmasking recessive deleterious mutations), it is also an important mechanism by which allelic incompatibilities are purged in hybrids (Smukowski Heil et al. 2017; Lancaster et al. 2019). Furthermore, reciprocal mitotic crossovers result in the shuffling of haplotypes between the parental sub-genomes of hybrids, producing chromosomes that are a mosaic of the ancestral parents (Sipiczki 2008). Mitotic recombination has also been characterized in diploid oomycetes (Dale et al. 2019) and diatoms (Bulankova et al. 2021), suggesting that it is an important evolutionary process across the tree of life.

Another phenomenon associated with asexual reproduction is the emergence of aneuploidy via the loss or gain of copies of specific chromosomes. Unbalanced chromosome numbers have been observed in many eukaryotes and are prevalent in several fungi (Vande Zande et al. 2023). As with hybridization, aneuploidy and its subsequent effects (e.g. gene expression changes) are associated with adaptation to new environments and pathogenicity, although negative fitness effects are also common (Santaguida and Amon 2015). For example, whole or partial aneuploidies are common among clinical isolates of Candida albicans (Hirakawa et al. 2015), and an aneuploidy affecting a specific chromosome arm causes antifungal drug resistance (Selmecki et al. 2006). Under laboratory evolution of S. cerevisiae, chromosomal duplications have also been observed as transient adaptations to abrupt heat stress (Yona et al. 2012). Transient aneuploidy can result in LOH at the scale of entire chromosomes via duplication of one chromosome copy and loss of the other (Yuen et al. 2007; Andersen et al. 2008).

These drivers of evolution have received scarce attention in chlorophyte algae. Although facultatively sexual, C. reinhardtii (class Chlorophyceae) grows asexually as a haploid, with the diploid stage usually limited to a dormant zygospore that obligately divides by meiosis (Harris 2001). Similarly, asexual reproduction in Chlorella (Trebouxiophyceae) is haploid, with a possible cryptic, uncharacterized sexual cycle (Blanc et al. 2010; Fučiková et al. 2015). Ancient mating type loci have also been identified in haploid Ostreococcus species (Mamiellophyceae), demonstrating the presence of haplontic life cycles at the deepest evolutionary branch of the Chlorophyta (Benites et al. 2021). Despite this, several vegetatively diploid species have recently been reported (Fig. 1A). Diploid genome assemblies have been produced for trebouxiophyte algae from the related genera Nannochloris (Sanders et al. 2022) and Picochlorum (Foflonker et al. 2018; Becker et al. 2020; Barten et al. 2022; da Roza et al. 2024), and for the lichen-forming Trebouxia lynniae (Gazquez et al. 2024). Chloropicon primus (Chloropicophyceae) is trisomic (3 copies) for a single chromosome and is otherwise diploid (Lemieux et al. 2019). In the Chlorophyceae, diploids from the genera Haematococcus (Volvocales) and Tetradesmus (Sphaeropleales) have been reported (Calhoun et al. 2021; Marcolungo et al. 2024). Most notably, Biondi et al. (2024) described a diploid Tetradesmus obliquus strain with extensive heterozygosity but one entirely homozygous chromosome, patterns potentially consistent with hybridization followed by chromosome-scale LOH. These results suggest that the presence of diplontic life cycles may be underestimated in Chlorophyta, although the prevalence of hybridization, LOH, and aneuploidy is presently unknown.

Figure 1.

Phylogenetic relationships of Auxenochlorella. A) Maximum likelihood phylogeny based on concatenated protein alignment of 669 single-copy orthologs, using the LG + F + R5 substitution model. All nodes received ultrafast bootstrap values >95%, except the node connecting Picochlorum and Nannochloris to Chlorella and Micractinium (68% ultrafast bootstrap support). Putative transitions to at least partial vegetative diploidy are shown by the “2n” symbol. Divergence estimates were taken from Del Cortona et al. (2020). B) Maximum likelihood phylogeny based on 18S ribosomal DNA alignment, using the TNe + I substitution model. See Supplementary Data Set 1 for strain metadata.

Here, we present a fully phased, telomere-to-telomere diploid genome assembly for the Auxenochlorella strain UTEX 250-A. We find an unusual genome architecture where 6 of the 12 chromosome pairs are highly rearranged and nonhomologous. We demonstrate that UTEX 250-A is an allodiploid hybrid of Auxenochlorella protothecoides and Auxenochlorella symbiontica, two species that differ in their karyotypes. Despite the high heterozygosity introduced by hybridization, several local genomic regions and three entire chromosomes are homozygous following extensive LOH, while two other chromosomes are trisomic. We introduce a highly curated structural annotation of the genome, reporting a streamlined nuclear genome architecture and gene content. Finally, we present a toolkit for genetic manipulation, featuring multiple selectable markers and fluorescent protein expression for intracellular localization studies. We demonstrate application of these tools for reverse genetics via targeted disruption of a gene encoding a chlorophyll biosynthesis enzyme in UTEX 250-A and strains of both parental species, and experimentally verify one UTEX 250-A trisomy by allele-specific targeting of a reporter construct. These resources demonstrate that a green algal genome can be shaped by forces experienced by diplontic taxa like yeasts, and provide a platform for the use of Auxenochlorella as a facile reference organism in plant biology and bioengineering.

Results

The genus Auxenochlorella and existing genomic data

First isolated in the 1890s as “Chlorella protothecoides” (Krüger 1894a; Krüger 1894b), Auxenochlorella was later established as an independent monotypic genus based on physiological, morphological, and genetic data (Kalina and Punčochárová 1987; Huss et al. 1999). Indeed, Auxenochlorella and true Chlorella species likely diverged more than 550 million years ago (Del Cortona et al. 2020) and form distinct lineages within the taxonomically diverse order Chlorellales (Fig. 1A). The closest relatives of Auxenochlorella have all lost photosynthesis; Prototheca is a polyphyletic assemblage of heterotrophic free-living and opportunistically parasitic algae, and Helicosporidium are obligate parasites of arthropods (Jagielski et al. 2019). Collectively, these algae form the “AHP lineage” (Ueno et al. 2005), and phylogenetic analyses support multiple independent transitions to heterotrophy, with Auxenochlorella enduring as the sole photosynthetic lineage (Figueroa-Martinez et al. 2015; Suzuki et al. 2018; Guo et al. 2022).

Several isolates classified as A. protothecoides are maintained in culture centers, including the original strain isolated by W. Krüger from the sap of a poplar tree (UTEX 25 at the University of Texas Culture Collection of Algae). Auxenochlorella have also been isolated from fresh and salt water, soil, and as symbionts of freshwater hydrozoans and sponges (Fig. 1B). Based on ribosomal DNA (rDNA) analysis, Darienko and Pröschold (2015) described a second species, A. symbiontica, with the type strain CCAP 211/61 isolated from Hydra viridis. We produced an 18S rDNA phylogeny including some additional isolates (Supplementary Data Set 1), which supports an association between A. protothecoides and both tree sap and freshwater habitats, whereas most A. symbiontica strains were isolated from marine environments (Fig. 1B). However, the marine strain UTEX 2341 groups with A. protothecoides and freshwater symbionts have been isolated for both species, suggesting that both A. protothecoides and A. symbiontica can live in multiple and overlapping environments.

Gao et al. (2014) produced the first draft assembly of A. protothecoides “0710” (=UTEX 25, see below), while additional draft assemblies of UTEX 25 (Vogler et al. 2018) and UTEX 2341 are available at NCBI (Supplementary Data Set 2). Several draft assemblies of Prototheca and Helicosporidium have been produced (Pombert et al. 2014; Severgnini et al. 2018; Suzuki et al. 2018; Bakuła et al. 2021; Guo et al. 2022). Although these assemblies are all collapsed to haploids, Guo et al. (2022) reported heterozygosity in two Prototheca wickerhamii genomes. Every available AHP lineage genome is highly streamlined relative to the ∼40 to 60 Mb genomes of Chlorella species; e.g. the A. protothecoides 0710 assembly has a haploid length of 22.9 Mb, with ∼7,000 predicted genes and low repeat content (Gao et al. 2014) (Table 1). It is noteworthy that Picochlorum and Nannochloris species also possess diploid streamlined genomes (Table 1), and phylogenetic placement of these species as a sister lineage of Chlorella (including Micractinium) received low bootstrap support in our analysis (Fig. 1A). Whether the transition to a diplontic life cycle occurred independently in the AHP lineage and Picochlorum/Nannochloris, or whether these lineages may in fact form sister groups, is an outstanding question.

Table 1.

Summary statistics of representative trebouxiophyte genomes with C. reinhardtii as an outgroup

| Genome | Size (Mb) | Genes | TE genes | lncRNA genes | % GC | % Repeats | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Auxenochlorella UTEX 250-A, haplotype A | 22.0 | 7,509 | 86 | 547 | 63.9 | 6.22 | This study |
| Auxenochlorella UTEX 250-A, haplotype B | 22.0 | 7,515 | 121 | 574 | 63.9 | 6.21 | This study |
| Auxenochlorella protothecoides 0710 | 22.9 | 6,988 | 26 | N/A | 63.6 | 5.58 | Gao et al. (2014) |
| Prototheca cutis JCM15793 | 20.0 | 5,600 | 5 | N/A | 61.3 | 5.84 | Suzuki et al. (2018); Kwon et al. (2023) |
| Chlorella vulgaris 211/11P | 40.2 | 10,341 | 357 | N/A | 61.8 | 13.1 | Cecchin et al. (2019) |
| Chlorella sorokiniana UTEX 1602 | 59.4 | 9,489 | 37 | N/A | 64.1 | 12.9 | Arriola et al. (2018) |
| Micractinium conductrix SAG 241.80 | 60.8 | 9,027 | 190 | N/A | 67.3 | 21.5 | Arriola et al. (2018) |
| Nannochloris desiccata UTEX 2526 | 21.6 | 9,123 | 187 | N/A | 45.0 | 8.12 | Sanders et al. (2022) |
| Picochlorum sp. BPE23 (haplotype A) | 14.8 | 7,216 | 11 | N/A | 46.3 | 9.72 | Barten et al. (2022) |
| Coccomyxa subellipsoidea v3 | 48.9 | 10,836 | 57 | N/A | 52.9 | 6.04 | Blanc et al. (2012) |
| Chlamydomonas reinhardtii v6.1 | 114.0 | 16,801 | 810 | N/A | 64.1 | 26.4 | Craig et al. (2023) |

See methods for determination of transposable element (TE) genes included in annotations; note that these numbers are not representative of the true number of TE genes in each genome due to differences in annotation methodology.

lncRNA, long noncoding RNA; N/A, not applicable.

Telomere-to-telomere, phased diploid assembly of Auxenochlorella UTEX 250-A

To serve as a foundation for discovery research and bioengineering, we sought to produce a reference-quality genome assembly of Auxenochlorella. We selected the strain UTEX 250 based on its robust performance in high cell density fermentations (Brooks et al. 2024). Metadata on UTEX 250 are sparse; the strain was isolated before 1950 in the Netherlands by A. J. Kluyver, possibly from freshwater. Although described as A. protothecoides, UTEX 250 groups with A. symbiontica based on 18S rDNA (Darienko and Pröschold 2015) (Fig. 1B). Since our line of the strain has been independently maintained for several years after acquisition from UTEX, we performed single colony purification prior to genome sequencing, and we designate this sub-clone “UTEX 250-A.”

We generated high-coverage Pacific Biosciences (PacBio) HiFi and linked-read OmniC datasets for UTEX 250-A and assembled the genome using a combination of automated and manual steps (see Methods). This effort yielded a telomere-to-telomere diploid genome assembly comprising 2 haplotypes, each featuring 12 gapless nuclear chromosomes spanning ∼22.0 Mb (i.e. collectively 24 nuclear chromosomes spanning 44.0 Mb) (Table 1). Heterozygosity between the 2 haplotypes is 2.71% (i.e. homologous chromosomal regions differ by ∼27 single nucleotide polymorphisms per 1 kb; heterozygosity calculations exclude entirely homozygous regions, see below). This extensive inter-haplotype variation made it possible to fully phase the assembled chromosomes using the accurate PacBio reads, so that each haplotype corresponds to a physical chromosome in UTEX 250-A. We arbitrarily label these haplotypes “A” and “B.” As presented below, only 6 of the 12 chromosome pairs exhibit one-to-one homology, and the remaining 6 are highly rearranged between the 2 haplotypes.
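The relationship between percent heterozygosity and SNP density, and the effect of excluding entirely homozygous regions, can be illustrated with a minimal Python sketch. The function names and the toy window counts are hypothetical and chosen only so that the arithmetic matches the figures quoted above; this is not the pipeline used in the study.

```python
def snps_per_kb(heterozygosity_percent):
    """Convert percent heterozygosity into expected SNPs per 1 kb."""
    return heterozygosity_percent / 100 * 1000


def heterozygosity_percent(windows):
    """Percent heterozygosity across windows, skipping entirely homozygous ones.

    `windows` is a list of (snp_count, aligned_bases) tuples (hypothetical input).
    """
    kept = [(snps, bases) for snps, bases in windows if snps > 0]
    total_snps = sum(snps for snps, _ in kept)
    total_bases = sum(bases for _, bases in kept)
    if total_bases == 0:
        return 0.0
    return 100 * total_snps / total_bases


# Toy data: two heterozygous 10 kb windows and one entirely homozygous window.
toy_windows = [(270, 10_000), (272, 10_000), (0, 10_000)]
print(round(snps_per_kb(2.71), 1))                     # 27.1
print(round(heterozygosity_percent(toy_windows), 2))   # 2.71
```

Including the homozygous window in the denominator would lower the toy estimate to about 1.81%, which is why the exclusion matters when comparing strains with different amounts of LOH.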

All chromosomes terminate in the telomeric repeat (TTAGGG)n, except for one chromosome arm per haplotype that terminates in a truncated rDNA array. This telomeric motif differs from the (TTTAGGG)n repeat found in the trebouxiophyte genera Chlorella and Coccomyxa, which is likely ancestral to green algae, suggesting an independent transition to (TTAGGG)n repeats as observed in some other green algae (Fulnečková et al. 2012). The 5S rDNA genes, which generally form tandem arrays independent of the major rDNA arrays, are present as 16 (haplotype A) or 17 (haplotype B) standalone genes dispersed throughout the genome. Similar nontandem 5S rDNA genes have been observed in species including the yeasts Schizosaccharomyces pombe and Yarrowia lipolytica (Torres-Machorro et al. 2010). An initial search for tRNA genes using tRNAscan-SE (Chan et al. 2021) yielded an incomplete set, and we subsequently detected permuted tRNA genes in which the 5′ and 3′ halves of the tRNA gene are organized in reverse (Supplementary Fig. S1). This unusual structure was previously reported in the genomes of the red alga Cyanidioschyzon merolae (Soma et al. 2007) and prasinophyte green algae (Maruyama et al. 2010).
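As an illustration of how the two candidate telomeric motifs could be distinguished at chromosome ends, the following Python sketch counts tandem motif copies that run to either terminus. It is a simplified, hypothetical helper: it assumes the repeat array is in phase with the sequence end and contains no sequencing errors, which real telomeres need not satisfy.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def terminal_repeat_count(seq, motif, end="right"):
    """Count tandem copies of `motif` that run to one end of `seq`.

    At the left end, the forward strand carries the reverse complement
    of the motif (e.g. CCCTAA for TTAGGG).
    """
    if end == "left":
        pattern = re.compile("^(?:%s)+" % reverse_complement(motif))
    else:
        pattern = re.compile("(?:%s)+$" % motif)
    match = pattern.search(seq)
    return 0 if match is None else len(match.group(0)) // len(motif)


# Toy chromosome with 5 left-end and 4 right-end (TTAGGG)n copies.
toy_chromosome = "CCCTAA" * 5 + "ACGTACGTTGCA" + "TTAGGG" * 4
for motif in ("TTAGGG", "TTTAGGG"):
    left = terminal_repeat_count(toy_chromosome, motif, end="left")
    right = terminal_repeat_count(toy_chromosome, motif, end="right")
    print(motif, left, right)
```

On the toy sequence, only the 6-bp motif yields nonzero counts at both ends, mirroring the distinction between (TTAGGG)n and (TTTAGGG)n described above.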

As in A. protothecoides 0710, repeat content is low, with interspersed and tandem repeats spanning 3.4% and 2.8% of the genome, respectively (Table 1). Among the interspersed repeats, we identified potentially active families of Metaviridae long terminal repeat (LTR) retrotransposons, and hAT and EnSpm-Plavaka cut-and-paste DNA transposons, with many insertions segregating as polymorphisms between the 2 haplotypes (Supplementary Fig. S2, Supplementary Data Set 3). The most abundant repeat is an uncharacterized element that encodes a Fanzor protein; Fanzors are RNA-guided endonucleases carried by diverse transposons in eukaryotes and their giant viruses (Saito et al. 2023). Phylogenetically diverse Fanzors are sporadically found in green algal genomes, including C. reinhardtii (Bao and Jurka 2013) and Chloropicon primus (Yoon et al. 2023), and presumably arise via lateral gene transfer.

Green algal centromeres identified thus far are transposon-enriched regions of tens to hundreds of kilobases in species of Chlorella, Coccomyxa, and Chlamydomonas (Blanc et al. 2012; Craig et al. 2021; Wang et al. 2024). In line with the paucity of repeats, we found no obvious repeat-rich centromere candidates in UTEX 250-A. We searched for short regional centromeres, nonrepetitive AT-rich regions of 1 to 5 kb as found in C. merolae (Kanesaki et al. 2015) and the diatom Phaeodactylum tricornutum (Diner et al. 2017). On 9 of the 12 chromosomes, we identified single AT-rich (mean 48.0% GC, relative to 63.9% GC genome-wide) intergenic regions spanning 2.1 to 7.6 kb (Supplementary Fig. S3, Supplementary Data Set 4). Although ChIP-sequencing of the centromeric histone variant will be required to confirm centromere locations, considered alongside the low repeat content and nontandem 5S rDNA genes, these candidates support a highly streamlined architecture of the UTEX 250-A genome.
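A search for AT-rich candidate regions of this kind can be sketched as a sliding-window GC scan followed by merging of overlapping hits. The window size, step, and GC threshold below are hypothetical illustration values, not the parameters used in the study, and the sketch ignores the additional requirement that candidates be intergenic and nonrepetitive.

```python
def gc_percent(seq):
    """Percent G+C of a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def at_rich_windows(seq, window=1000, step=500, max_gc=45.0):
    """Return (start, end) for every window whose GC content is <= max_gc."""
    hits = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        if gc_percent(chunk) <= max_gc:
            hits.append((start, start + len(chunk)))
    return hits


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy chromosome: a 2 kb AT-only block flanked by 2 kb GC-only blocks.
toy_sequence = "GC" * 1000 + "AT" * 1000 + "GC" * 1000
print(merge_intervals(at_rich_windows(toy_sequence)))  # [(2000, 4000)]
```

Windows that straddle a block boundary sit at exactly 50% GC in this toy example and are therefore excluded by the 45% threshold, so the merged candidate coincides with the AT-only block.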

We produced complete circular genome assemblies for the plastome (84.6 kb, 30.8% GC; Supplementary Fig. S4) and mitogenome (54.0 kb, 29.0% GC; Supplementary Fig. S5). The UTEX 250-A plastome is broadly consistent with existing assemblies from A. protothecoides UTEX 25 (Yan et al. 2015; Park et al. 2022). Briefly, the plastome features 78 protein-coding genes and lacks internal repeats, introns, and trans-spliced genes, resulting in a more compact architecture relative to most trebouxiophyte plastomes. The mitogenome features 37 protein-coding genes, including two LAGLIDADG homing endonucleases encoded by introns in cox1, as also observed in the P. wickerhamii mitogenome (Wolff et al. 1993).

Auxenochlorella UTEX 250 is a putative allodiploid hybrid

Considering the high heterozygosity and rearranged nature of half of the UTEX 250-A diploid chromosomes, we speculated that UTEX 250 may be an allodiploid hybrid between two distinct parental species that possess different karyotypes. Given the close evolutionary relationship between A. protothecoides and A. symbiontica (Fig. 1B), we explored this hypothesis via genome sequencing of strains of both species. We assembled gapless chromosome-level assemblies for A. protothecoides strains UTEX 25 and UTEX 2341, and A. symbiontica CCAP 211/61, from PacBio HiFi data (supplemented by Oxford Nanopore Technologies, ONT, sequencing for CCAP 211/61; see Methods). The three assemblies were broadly consistent with our UTEX 250-A assembly; each nuclear genome comprises 12 diploid chromosomes, with haploid genome sizes ranging from 21.9 Mb to 23.0 Mb (Supplementary Data Set 5). However, unlike UTEX 250-A, we observed no rearrangements between the homologous chromosome pairs in each genome, implying a typical diploid karyotype in these strains.

We first considered the 6 homologous UTEX 250-A chromosomes. Each of these 6 chromosomes corresponds to a pair of homologous chromosomes in UTEX 25, UTEX 2341, and CCAP 211/61, suggesting that they are ancestral to Auxenochlorella. In UTEX 250-A, three of these chromosomes (1, 2, and 6) are mostly heterozygous, whereas the remaining chromosomes (4, 9, and 10) are entirely homozygous (Fig. 2A). We presume that the homozygous chromosomes were originally heterozygous, with homozygosity arising following chromosome-scale LOH events (see below). As introduced, heterozygosity in UTEX 250-A is ∼2.7%, although many local genomic regions are homozygous (e.g. the ∼73 kb homozygous region at the left arm terminus of chromosome 1, Fig. 2A). In A. protothecoides, heterozygosity is substantially lower at 0.59% (UTEX 25) and 0.62% (UTEX 2341) in heterozygous regions, with both genomes also exhibiting extensive regional homozygosity. Average heterozygosity in CCAP 211/61 is 1.35% in heterozygous regions, implying that A. symbiontica harbors greater genetic diversity than A. protothecoides.

Figure 2.

Allodiploid hybrid origin of UTEX 250. A) Heterozygosity and divergence for four Auxenochlorella strains in 10 kb windows across the 6 UTEX 250-A chromosomes with one-to-one homology. For divergence metrics the upper panels refer to haplotype A, lower panels to haplotype B. B) Genomic rearrangements among the remaining 6 diploid UTEX 250-A chromosomes. C) Comparison between rearranged UTEX 250-A chromosomes and diploid A. protothecoides and A. symbiontica chromosomes. D) Neighbor-joining phylogeny of all Auxenochlorella haplotypes (all bootstrap values 100%). E) Neighbor-joining phylogeny of plastome sequences (all bootstrap values 100%). F) Neighbor-joining phylogeny of mitogenome sequences (all bootstrap values 100%).

When comparing the A and B haplotypes of the UTEX 250-A chromosomes to the other Auxenochlorella strains, we observed substantial inter-haplotype variation in genetic divergence (Fig. 2A). For example, on chromosome 2, haplotype A differs on average by 0.57% and 0.49% relative to the homologous chromosomes of UTEX 25 and UTEX 2341, respectively. Meanwhile, haplotype B differs by 2.69% (UTEX 25) and 2.66% (UTEX 2341). Therefore, for chromosome 2, the divergence between haplotype A and the A. protothecoides genomes is comparable to heterozygosity between chromosome pairs in A. protothecoides, whereas the divergence between haplotype B and A. protothecoides is comparable to heterozygosity in UTEX 250-A. The reciprocal pattern is observed via comparison to A. symbiontica; divergence between haplotype B and CCAP 211/61 (1.64%) is comparable to heterozygosity in CCAP 211/61, whereas divergence between haplotype A and CCAP 211/61 (2.84%) is comparable to UTEX 250-A heterozygosity. These results are consistent with haplotype A of chromosome 2 originating from an A. protothecoides-like ancestor, and haplotype B originating from an A. symbiontica-like ancestor. Extending this logic, it is possible to use genetic divergence to assign a putative parent of origin to all genomic regions in UTEX 250-A (Fig. 2, A and B).
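The divergence-based assignment described above can be sketched in a few lines of Python using the chromosome 2 values quoted in the text. The function name, the averaging of the two A. protothecoides strains, and the `min_gap` threshold for calling a region ambiguous are all hypothetical choices made for illustration; the study's actual assignment criteria may differ.

```python
def assign_parent(div_to_protothecoides, div_to_symbiontica, min_gap=0.5):
    """Assign a putative parent of origin from percent divergence values.

    The parent with the lower divergence is chosen; regions where the two
    values differ by less than `min_gap` (hypothetical) are left ambiguous.
    """
    if abs(div_to_protothecoides - div_to_symbiontica) < min_gap:
        return "ambiguous"
    if div_to_protothecoides < div_to_symbiontica:
        return "A. protothecoides-like"
    return "A. symbiontica-like"


# Percent divergence of each chromosome 2 haplotype, as reported in the text.
chromosome_2 = {
    "A": {"UTEX 25": 0.57, "UTEX 2341": 0.49, "CCAP 211/61": 2.84},
    "B": {"UTEX 25": 2.69, "UTEX 2341": 2.66, "CCAP 211/61": 1.64},
}

for haplotype, divergence in chromosome_2.items():
    # Average the two A. protothecoides strains (an illustrative simplification).
    proto = (divergence["UTEX 25"] + divergence["UTEX 2341"]) / 2
    print(haplotype, assign_parent(proto, divergence["CCAP 211/61"]))
```

With these inputs, haplotype A is labeled A. protothecoides-like and haplotype B is labeled A. symbiontica-like. Applying the same comparison in windows along each chromosome would yield the per-region assignments discussed in the text.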

In contrast to chromosome 2, we observed intrahaplotype switches in assigned parentage on the remaining five homologous chromosome pairs (Fig. 2A). On chromosome 1, a single switch corresponds to the 73 kb homozygous region at the left arm terminus, and can be attributed to a LOH event in which the ancestral haplotype A copy of this region was replaced by haplotype B, resulting in both resembling A. protothecoides. Conversely, the switches on chromosome 6 are reciprocal, resulting in a ∼360 kb region of A. protothecoides-like sequence on an otherwise A. symbiontica-like chromosome for haplotype A (and vice versa on haplotype B). This mosaicism could be explained by mitotic crossover following hybridization, i.e. each haplotype originally corresponded to one parent (as for chromosome 2), but has subsequently undergone reciprocal exchange. The homozygous chromosomes 4, 9, and 10 also exhibit switches in assigned parentage that are consistent with either LOH or crossover events occurring in a heterozygous state that preceded complete LOH. Note that the A and B labels are arbitrary, and the genomic mosaicism makes it impossible to match physical haplotypes to parents of origin in UTEX 250-A.
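The distinction between nonreciprocal (LOH-like) and reciprocal switches can be made concrete with a small Python sketch that walks along per-window parent labels for both haplotypes. The labels "P" (A. protothecoides-like) and "S" (A. symbiontica-like), the function name, and the toy window patterns are hypothetical; the patterns are only loosely modeled on the chromosome 1 and chromosome 6 cases described above, with arbitrary window counts.

```python
def classify_switches(hap_a, hap_b):
    """Classify switches in assigned parentage between adjacent windows.

    A switch on both haplotypes that leaves them assigned to different
    parents is called a reciprocal exchange; any other switch is called
    nonreciprocal (LOH-like). Returns (window_index, label) tuples.
    """
    events = []
    for i in range(1, len(hap_a)):
        a_switch = hap_a[i] != hap_a[i - 1]
        b_switch = hap_b[i] != hap_b[i - 1]
        if a_switch and b_switch and hap_a[i] != hap_b[i]:
            events.append((i, "reciprocal exchange"))
        elif a_switch or b_switch:
            events.append((i, "nonreciprocal (LOH-like)"))
    return events


# Chromosome 1-like toy pattern: a terminal region where both haplotypes are "P".
print(classify_switches(["P", "S", "S", "S"], ["P", "P", "P", "P"]))

# Chromosome 6-like toy pattern: an internal block swapped between haplotypes.
print(classify_switches(["S", "P", "P", "S"], ["P", "S", "S", "P"]))
```

The first pattern produces a single nonreciprocal event at the boundary of the homozygous terminal region, whereas the second produces two reciprocal events flanking the exchanged block.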

We next tested whether the UTEX 250-A rearranged chromosomes correspond to the ancestral karyotypes of A. protothecoides and A. symbiontica. The extent of the rearrangements among these chromosomes can only be explained by multiple translocations, one fusion, and one fission (Fig. 2B). Remarkably, 8 of the 12 chromosome copies correspond exactly to a homologous chromosome pair in either A. protothecoides (both UTEX 25 and UTEX 2341, which have an identical karyotype) or A. symbiontica (CCAP 211/61). Specifically, UTEX 250-A chromosomes B05 (i.e. chromosome 5, haplotype B), A08, and A11 correspond to homologous chromosome pairs in A. protothecoides, whereas A05, A07, B08, B11, and A12 correspond to homologous chromosome pairs in A. symbiontica (Fig. 2C). Therefore, most of the rearrangements in UTEX 250-A can be explained by direct inheritance from parental species with different karyotypes. The remaining chromosomal rearrangements can be explained by events that occurred following hybridization. The UTEX 250-A chromosomes B03 and B12 correspond to a pair of A. protothecoides chromosomes, suggesting fission of the ancestral A. protothecoides chromosome 3 and subsequent fusion of one of the resulting chromosomal fragments (which is predicted to have lacked a centromere) to the ancestral A. protothecoides chromosome 12 (Fig. 2C). Interestingly, LTR elements are present at both the de novo terminus of B03 and the fusion site of B12, suggesting that retrotransposons may have invaded (and potentially stabilized) the unprotected chromosomal ends following fission. Finally, the UTEX 250-A chromosomes A03 and B07 are consistent with a reciprocal translocation between the ancestral A. protothecoides chromosome 7 and A. symbiontica chromosome 3 (Fig. 2C). These ancestral chromosomes feature a common locus of ∼6 kb that corresponds to two genes that are part of the largest Auxenochlorella-specific gene family (see below), suggesting that the translocation occurred via ectopic recombination between these nonhomologous repeats. It is also noteworthy that switches in the assigned parent of origin are frequently observed on the rearranged chromosomes (Fig. 2, B and C), implying that mitotic recombination has occurred between homologous regions despite the rearrangements.

Finally, we constructed a phylogeny based on genomic regions that: (i) are present on UTEX 250-A chromosomes that have not obviously experienced mitotic crossovers, and (ii) are not affected by LOH in any of the four strains. As expected, this analysis clearly divided the UTEX 250-A genome into regions that clustered with either UTEX 25 and UTEX 2341 (A. protothecoides-like) or CCAP 211/61 (A. symbiontica-like), with the longer branch lengths for A. symbiontica reflecting the higher genetic diversity in the species (Fig. 2D). We also constructed trees for the plastome (Fig. 2E) and mitogenome (Fig. 2F), which demonstrated putative inheritance from A. symbiontica in both cases.

Overall, A. protothecoides and A. symbiontica represent authentic species that are ∼2.7% divergent at the sequence level and are distinguished at the chromosome level by multiple rearrangements. Assuming that the sequenced strains are representative, genetic diversity may be more than 2-fold higher in A. symbiontica than A. protothecoides. UTEX 250 is a putative allodiploid hybrid that presumably originated via the fusion of haploid cells of A. symbiontica and A. protothecoides. The ancestrally inherited haplotypes have subsequently been mixed via nonreciprocal and reciprocal mitotic exchange, and 4 ancestrally inherited chromosomes have been involved in post-hybridization rearrangements. Given the efficiency of homologous recombination in Auxenochlorella (see below), it is possible that the rapid emergence of chromosomal differences between the closely related A. protothecoides and A. symbiontica may have been driven by ectopic recombination events. Indeed, assuming a sexual life cycle for Auxenochlorella, UTEX 250-A is presumably incapable of meiosis due to the nonhomologous nature of its chromosomes (Garagna et al. 2014), and the karyotypic variation between the two species may present a barrier to gene flow.

Pervasive loss-of-heterozygosity and evolution in the laboratory

LOH events are associated with several mechanisms and can affect local genomic regions (interstitial LOH), large regions that extend to a chromosome terminus (terminal LOH), or entire chromosomes. We observed all 3 LOH categories in UTEX 250-A, resulting in at least 35.5% of the genome being homozygous (conservatively assuming a minimum interstitial LOH length of 1 kb). The three entirely homozygous chromosomes, 4, 9, and 10, collectively span 28.2% of the genome. Terminal LOH events affect 7 chromosomal termini (e.g. the left arm terminus of chromosome A01, Fig. 2A), ranging from ∼12 kb to ∼271 kb (mean 111 kb) and collectively spanning 3.5% of the genome. We observed 278 putative interstitial LOH events, 36 of which exceed 5 kb in length, reaching a maximum of 37.1 kb. Notably, LOH is pervasive in all of the sequenced Auxenochlorella genomes; 23.3% of the A. symbiontica CCAP 211/61 genome is homozygous, whereas levels of homozygosity reach 67.2% (UTEX 25) and 80.3% (UTEX 2341) in A. protothecoides.
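The three LOH categories can be told apart from tract coordinates alone. The following Python sketch is illustrative only and is not the pipeline used in this study: the function name `classify_loh`, the 0-based half-open coordinate convention, and the 1 kb end tolerance are all assumptions. The final line simply subtracts the reported whole-chromosome and terminal fractions from the reported total to show the share implied for interstitial tracts.

```python
def classify_loh(start, end, chrom_len, end_tol=1_000):
    """Classify a homozygous tract on one chromosome.

    start, end: 0-based, half-open tract coordinates (assumed convention).
    chrom_len:  chromosome length in bp.
    end_tol:    assumed tolerance (bp) for a tract to count as reaching
                a chromosome terminus.
    """
    touches_left = start <= end_tol
    touches_right = end >= chrom_len - end_tol
    if touches_left and touches_right:
        return "whole-chromosome"
    if touches_left or touches_right:
        return "terminal"
    return "interstitial"


# Genome fractions quoted in the text (percent of the genome).
total_homozygous = 35.5   # lower bound, interstitial tracts >= 1 kb
whole_chromosome = 28.2   # chromosomes 4, 9, and 10
terminal = 3.5            # 7 chromosomal termini

# Share implied for interstitial LOH under these figures.
interstitial_pct = round(total_homozygous - (whole_chromosome + terminal), 1)
```

Under the reported figures, the implied interstitial share is roughly 3.8% of the genome; chromosome lengths passed to `classify_loh` would need to come from the assembly itself.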

Since all the studied Auxenochlorella strains have been in culture for decades, we asked whether LOH could have accumulated in the laboratory. UTEX 25 is one of the oldest microbes in culture, having been isolated in the late 19th century (Krüger 1894a; Krüger 1894b). CCAP 211/61 was isolated in 1982, and UTEX 2341 prior to 1984 (Seto et al. 1984). Conveniently, many of these strains are maintained at different culture centers, including UTEX and CCAP (Culture Collection of Algae and Protozoa). Although records are uncertain, a sister lineage of UTEX 250 has likely been independently maintained for more than 70 years as CCAP 211/7C (Fig. 3A). To test whether LOH has occurred during culture, we obtained CCAP 211/7C and performed high-coverage ONT sequencing.

Figure 3.

Loss-of-heterozygosity and evolution in the laboratory. A) Putative strain history of UTEX 250. IFE = Institute of Freshwater Ecology, SAMS = Scottish Association for Marine Science, PBI = Phycoil Biotechnology International, Inc. B) Heterozygosity in 10 kb windows across 3 example chromosomes. Chromosomes are shaded according to parental species of origin, and candidate centromeres are marked where available. LOH = loss-of-heterozygosity. Different LOH categories in UTEX 250-A are shown by colored boxes.

Remarkably, LOH events affect only 19.7% of the CCAP 211/7C genome. Although chromosome 10 is entirely homozygous in both UTEX 250-A and CCAP 211/7C, chromosomes 4 and 9 are heterozygous in CCAP 211/7C (Fig. 3B, Supplementary Fig. S6). Thus, two of the chromosome-scale LOH events must have occurred during culture at UTEX or in our laboratory. Similarly, of the 7 UTEX 250-A terminal LOH events, only 5 are present in CCAP 211/7C (Supplementary Fig. S6). For example, both terminal LOH events on chromosome A07 and the right arm terminal LOH event on A08 are shared, whereas the event on the left arm terminus of A08 is unique to UTEX 250-A (Fig. 3B). The same pattern is observed for interstitial LOH, e.g. the largest interstitial event in the genome (37.1 kb at ∼1.66 Mb on chromosome A08) is shared, whereas the second largest (29.2 kb, ∼0.69 Mb on A07) is unique to UTEX 250-A (Fig. 3B). Interestingly, almost all of the strain-specific LOH events are unique to UTEX 250-A, suggesting that the frequency of LOH during culture at CCAP is substantially lower.

These results support the rapid accumulation of LOH events under at least some laboratory conditions, as has been observed experimentally in yeast (Sui et al. 2020; Dutta et al. 2021). In contrast, the UTEX 250-A chromosomal rearrangements that likely followed hybridization are also present in CCAP 211/7C (Supplementary Fig. S6), suggesting that these events occurred prior to sampling of the original strain (or at least before the UTEX and CCAP lineages were established). Although it is possible that some of the LOH events in UTEX 250-A and CCAP 211/7C were fixed by positive selection to overcome posthybridization incompatibilities, there is no obvious directionality with respect to parental species: approximately equal numbers of bases within regions affected by interstitial and terminal LOH events can be assigned to A. protothecoides and A. symbiontica. Furthermore, LOH is even more prominent in the nonhybrid A. protothecoides strains, suggesting that it is a widespread phenomenon in Auxenochlorella. It is possible that the different frequencies of LOH events in CCAP 211/7C and UTEX 250-A could be attributed to differences in culture conditions (e.g. liquid vs agar cultures that would affect clonal population sizes). We recommend cryopreservation and avoiding bottlenecks in culture sizes to limit the fixation of LOH events by drift.
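Heterozygosity profiles of the kind plotted in 10 kb windows in Fig. 3B can be approximated from a list of heterozygous site positions. The Python sketch below is a minimal illustration under stated assumptions, not the method used here: the input format (0-based positions on a single chromosome), the function names, and the `threshold` used to flag low-heterozygosity windows are all hypothetical.

```python
def windowed_heterozygosity(het_positions, chrom_len, window=10_000):
    """Heterozygous sites per bp in non-overlapping windows.

    het_positions: 0-based positions of heterozygous sites on one
                   chromosome (hypothetical input, e.g. from variant calls).
    Returns a list of (window_start, heterozygosity) tuples; the last
    window may be shorter than `window`.
    """
    n_windows = -(-chrom_len // window)  # ceiling division
    counts = [0] * n_windows
    for pos in het_positions:
        counts[pos // window] += 1
    profile = []
    for i, count in enumerate(counts):
        start = i * window
        size = min(window, chrom_len - start)
        profile.append((start, count / size))
    return profile


def low_heterozygosity_windows(profile, threshold):
    """Window starts whose heterozygosity falls below an assumed threshold."""
    return [start for start, het in profile if het < threshold]
```

Runs of adjacent flagged windows would be candidate LOH tracts; comparing the flagged windows between two lineages, such as UTEX 250-A and CCAP 211/7C, separates shared from lineage-specific events.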

Aneuploidy is widespread in Auxenochlorella

While assembling the UTEX 250-A genome, we noticed that two genomic regions exhibited unexpectedly high read coverage. Specifically, when mapping the PacBio reads against the diploid assembly, the entirety of chromosome B03 and a 932 kb fragment of B05 exhibit median read coverages of 528× and 519×, respectively, approximately twice the 272× coverage of the other individual chromosomes, suggesting that these regions are trisomic when considering both A and B haplotypes (Fig. 4A). The putatively duplicated part of B05 features the right arm terminus and associated telomere, and we identified reads featuring telomeric repeats that map internally to B05 precisely at the location where coverage doubles (Supplementary Fig. S7). Conceivably, B05 may have duplicated and then undergone fission, with the fragment retaining the centromere acquiring a de novo telomere at its left terminus, yielding a partly aneuploid chromosome. We compared read coverage between UTEX 250-A and CCAP 211/7C to explore whether the observed aneuploidies could have evolved during laboratory culture. Surprisingly, B05 exhibits normal coverage in the CCAP 211/7C ONT read dataset, whereas B03 has 3 times the expected coverage (163× relative to 53× genome-wide, Fig. 4B), suggesting tetrasomy for this chromosome (3 copies of B03 and 1 copy of the homologous region on A03; Supplementary Fig. S6). Thus, the B05 duplication appears to have occurred during culture leading to UTEX 250-A (or alternatively CCAP 211/7C has reverted to disomy), whereas aneuploidy of B03 may have existed prior to sampling, with subsequent copy number change during culture.

Figure 4.

Aneuploidy in Auxenochlorella. A) UTEX 250-A PacBio read coverage for 20-kb windows across the phased diploid genome assembly of UTEX 250-A. B) CCAP 211/7C ONT read coverage for 20-kb windows across the phased diploid genome assembly of UTEX 250-A. C) UTEX 25 PacBio read coverage and strain “0710” Illumina read coverage for 20-kb windows across a representative haplotype of the UTEX 25 genome assembly.

We next checked for possible aneuploidies in A. protothecoides UTEX 25 and UTEX 2341, and A. symbiontica CCAP 211/61, by mapping PacBio reads for each strain against one representative haplotype from their respective genome assembly. For UTEX 2341 and CCAP 211/61, coverage estimates were consistent across all chromosomes, suggesting euploidy (Supplementary Fig. S8). However, median read coverage for chromosome 1 (76×) of UTEX 25 was approximately 50% higher than median coverage across the rest of the genome (54×), implying trisomy (Fig. 4C). We also noted that the existing A. protothecoides “0710” genome assembly exhibited almost no variation relative to our UTEX 25 assembly, indicating that they are replicates of the same strain. Surprisingly, when mapping the “0710” Illumina genomic reads to our UTEX 25 assembly, we found no increase in coverage for chromosome 1, whereas coverage for chromosomes 7 and 10 (131× and 134×, respectively) was ∼50% higher than the median of 89× across the rest of the genome (Fig. 4C). Thus, it appears that the evolution of aneuploidy is dynamic, occurring on short timescales of laboratory culture, at least in the UTEX 250 and UTEX 25 genetic backgrounds. These results are in line with chromosome duplication events observed in several haploid green algal species during mutation accumulation experiments (Krasovec et al. 2023; Lopez-Cortegano et al. 2023). Transient aneuploidy may explain the rapid emergence of whole-chromosome LOH, suggesting that it is a prominent evolutionary force in Auxenochlorella. It is unclear whether any of the observed aneuploidies confer a selective advantage; at least one additional copy of B03 has been maintained over decades of culture in UTEX 250 and CCAP 211/7C, potentially indicating that it could be beneficial. There are 149 genes on B03, and analyses focusing on the effects of dosage on gene and protein expression will be required to address this.
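The copy-number reasoning in this section reduces to a coverage ratio, with one caveat: the baseline represents one copy when reads are mapped to the phased diploid assembly, but two copies when reads from a diploid are mapped to a single representative haplotype. The Python sketch below applies this to the median coverages quoted above. The function name `estimated_copies` and its `baseline_copies` parameter are hypothetical, and simple rounding of the ratio is an assumption rather than the procedure used in this study.

```python
def estimated_copies(region_cov, baseline_cov, baseline_copies=1):
    """Estimate copy number of a region from median read coverage.

    baseline_copies: copies represented by the baseline coverage.
      1 -> reads mapped to a phased diploid assembly (each haplotype
           chromosome is its own reference sequence).
      2 -> reads from a diploid mapped to one representative haplotype.
    """
    return round(baseline_copies * region_cov / baseline_cov)


# Phased UTEX 250-A assembly as reference (baseline = 1 copy).
phased = {
    "UTEX 250-A B03": estimated_copies(528, 272),
    "UTEX 250-A B05 fragment": estimated_copies(519, 272),
    "CCAP 211/7C B03": estimated_copies(163, 53),
}

# Single representative UTEX 25 haplotype as reference (baseline = 2 copies).
collapsed = {
    "UTEX 25 chr 1": estimated_copies(76, 54, baseline_copies=2),
    "0710 chr 7": estimated_copies(131, 89, baseline_copies=2),
    "0710 chr 10": estimated_copies(134, 89, baseline_copies=2),
}
```

For the phased reference, the estimate counts copies of the B haplotype chromosome only, so two copies of B03 plus the homologous region on A03 gives trisomy, and three copies gives tetrasomy. For the collapsed reference, an estimate of three corresponds directly to trisomy.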

Gene family reduction and loss contribute to the streamlined Auxenochlorella genome

To facilitate the development of Auxenochlorella as a model system, we aimed to produce a reference quality structural annotation for the UTEX 250-A genome. We sequenced PacBio cDNA (IsoSeq) libraries from UTEX 250-A grown either autotrophically or with glucose, and multiple Illumina RNAseq libraries from different growth conditions. Utilizing these datasets as evidence, we annotated gene models independently for the A and B haplotypes using a combination of gene predictors and extensive manual curation (see Methods). This yielded 7,509 and 7,515 protein-coding genes on the A and B haplotypes, respectively, with an additional 86 and 121 genes carried by transposable elements included in the annotation (Table 1). More than two-thirds of the gene models are derived from multiple full-length IsoSeq reads, enabling high confidence identification of transcription start sites (TSS), terminators, and alternative splice variants. Approximately 43% of genes are annotated with at least one alternative transcript, totaling 13,125 and 13,345 transcripts on the A and B haplotypes, respectively. Considering the A haplotype, the majority (56%) of alternative isoforms differ only in their UTR sequences (e.g. due to alternative TSSs or terminators) relative to another isoform of the same gene. For alternative isoforms that alter the predicted protein sequence, 31% feature a retained intron and 18% an exon with an alternate 5′ or 3′ splice junction. The remaining 51% were classified as complex isoforms, many of which are predicted to originate from internal promoters that produce truncated proteins (see below). Alternative transcripts that skip exons are rare (<1%). Finally, 612 gene models were corrected based on manual curation, including 43 bicistronic loci analyzed by Dueñas et al. (2025a), which are frequently misannotated by automated approaches (Gallaher et al. 2021). Speaking to the compactness of the UTEX 250-A genome, 74% of genes encode a protein with a Pfam domain (Supplementary Data Set 6), a metric that is typically 25% to 50% in algal genomes (Blaby-Haas and Merchant 2019). Via orthology to C. reinhardtii proteins, one-third of genes were also assigned formal gene symbols (Supplementary Data Set 7), which generally correspond to at least some level of functional curation (Blaby et al. 2014; Craig et al. 2023).

Via synteny analysis, we were able to match ∼98% of genes as allelic pairs between the A and B haplotypes. Genes were given a unique ID featuring the haplotype and chromosome (e.g. A11, B08) and a five-digit identifier that is shared between allelic pairs, e.g. despite being located on rearranged chromosomes, UTEX250_A11.30955 and UTEX250_B08.30955 refer to alleles of the same gene. As expected, heterozygosity between the A and B alleles at 4-fold degenerate sites in the coding sequence is substantially higher (3.44%) than at 0-fold degenerate sites (0.84%), although this nonetheless corresponds to more than 35,000 allelic variants that result in amino acid differences. Heterozygosity is highest in intergenic (3.51%) and intronic (4.02%) sites (Supplementary Data Set 8). Only ∼15% of the haplotype-specific genes encode a predicted protein with a Pfam domain (Supplementary Data Set 6), and many of these genes may be evolutionarily young or misannotated long noncoding RNA (lncRNA) genes that carry a spurious open reading frame (ORF). Some haplotype-specific genes are the result of structural variation between the A and B haplotypes, including a few copy number and presence/absence variants.
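Because allelic pairs share the five-digit identifier, alleles can be matched directly from gene IDs even when they sit on rearranged chromosomes. The Python sketch below is an illustration based only on the example IDs given in the text; the regular expression (a `UTEX250_` prefix, a haplotype letter, a two-digit chromosome number, and a five-digit identifier) is inferred from those examples and may not cover every ID in the annotation, and the function names are hypothetical.

```python
import re

# Pattern inferred from IDs such as UTEX250_A11.30955 (an assumption).
GENE_ID = re.compile(r"^UTEX250_([AB])(\d{2})\.(\d{5})$")


def parse_gene_id(gene_id):
    """Split a gene ID into (haplotype, chromosome number, shared identifier)."""
    match = GENE_ID.match(gene_id)
    if match is None:
        raise ValueError(f"unrecognized gene ID: {gene_id!r}")
    haplotype, chrom, shared = match.groups()
    return haplotype, int(chrom), shared


def are_alleles(id_a, id_b):
    """True if two IDs share an identifier but lie on different haplotypes."""
    hap_a, _, shared_a = parse_gene_id(id_a)
    hap_b, _, shared_b = parse_gene_id(id_b)
    return hap_a != hap_b and shared_a == shared_b
```

For example, `are_alleles("UTEX250_A11.30955", "UTEX250_B08.30955")` evaluates to `True`, although the two alleles are located on chromosomes 11 and 8, respectively.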

Although we identified over 500 additional genes in our UTEX 250-A genome relative to the existing A. protothecoides 0710 assembly, the Auxenochlorella gene number remains lower than that of most trebouxiophyte species, and substantially lower than that of chlorophyceaen species such as C. reinhardtii (Table 1). We performed an orthology analysis of UTEX 250-A proteins against the best quality chlorophyte annotations, arbitrarily using the A haplotype genes supplemented with genes unique to the B haplotype. Figure 1A presents a phylogeny based on 669 single-copy orthologs identified by this analysis, and Fig. 5A shows the intersecting sets of orthogroups among Chlorellales species, with C. subellipsoidea and C. reinhardtii as outgroups. Ignoring species-specific orthogroups, the largest set corresponds to gene families present in all species (2,974 orthogroups), many of which are presumably essential core-Chlorophyte gene families (Supplementary Data Set 9). The second largest set features gene families specifically absent in the nonphotosynthetic P. cutis (520 orthogroups), most of which are expected to have functions related to photosynthesis (Suzuki et al. 2018). With respect to reduced gene content in Auxenochlorella, 297 orthogroups are absent across all Chlorellales species, and 201 orthogroups are specifically absent in Auxenochlorella UTEX 250-A and P. cutis but present in all other species.

Figure 5.

Gene family loss and reduction in Auxenochlorella UTEX 250-A. A) UpSet plot showing intersection between the presence and absence of orthogroups among Trebouxiophyte algae, with C. reinhardtii as an outgroup. Only intersects with more than 100 orthogroups are shown. B) Number of genes per orthogroup for orthogroups that are present in each of Auxenochlorella UTEX 250-A, C. vulgaris and C. reinhardtii, and have more than one gene in at least one of the species. C) Introns per gene for multi-exonic genes. D) Intron length distributions. E) Presence of GreenCut2 genes in UTEX 250-A and C. vulgaris relative to C. reinhardtii. Light-harvesting complex genes of the photosystem I antenna (LHCA) and genes functioning in nonphotochemical quenching (NPQ) are highlighted; PSBS and LHCSR are present as 2 and 3 paralogous genes in C. reinhardtii and single orthologs in C. vulgaris (see Supplementary Data Set 13). The CGL, CPL, CGLD, and CPLD divisions are defined in the main text. F) Presence of CiliaCut, meiosis, and syngamy genes in UTEX 250-A and C. vulgaris relative to C. reinhardtii.

For more detailed analyses, we compared the Auxenochlorella UTEX 250-A annotation to the high-quality gene models of Chlorella vulgaris (Cecchin et al. 2019) and C. reinhardtii (Craig et al. 2023), excluding any genes carried by transposable elements. Considering gene family loss, there are 782 orthogroups that are absent from UTEX 250-A but present in both other species, featuring 1,025 C. vulgaris genes and 1,212 C. reinhardtii genes. Considering gene family size, 1,341 orthogroups are present in all three species and have more than one gene in at least one species, featuring 2,228 UTEX 250-A genes, 2,915 C. vulgaris genes, and 3,272 C. reinhardtii genes. Approximately 56% of these orthogroups feature a single UTEX 250-A gene, relative to 35% and 29% for C. vulgaris and C. reinhardtii, respectively (Fig. 5B). UTEX 250-A genes also contain fewer and shorter introns (mean 5.1 introns per gene, mean length 179 bp) relative to C. vulgaris (means 7.3 and 206 bp) and C. reinhardtii (means 7.8 and 272 bp) (Fig. 5, C and D). Thus, the streamlined Auxenochlorella UTEX 250-A genome can be explained by both complete loss and lower complexity of gene families, alongside lower repeat content (Table 1) and the presence of fewer and shorter introns.

Many of the lost genes can be linked to known metabolic pathways. Two distinguishing features of Auxenochlorella are an absolute requirement for thiamine and the inability to grow on nitrate as an N source (Kalina and Punčochárová 1987). As expected, the nitrate reductase gene NIA1 and THIC1, encoding 4-amino-5-hydroxymethyl-2-methylpyrimidine phosphate synthase necessary for thiamine biosynthesis, are among the genes present in C. vulgaris and C. reinhardtii but absent in UTEX 250-A. All other components of the thiamine biosynthesis pathway are present in UTEX 250-A, enabling thiamine production via salvage of breakdown products. Similarly, as reported by Gao et al. (2014), Auxenochlorella cannot utilize urea as an N source and genes responsible for urea assimilation (DUR1, DUR2) and transport (DUR3) have been lost. Several other transporters present in at least one copy in both C. vulgaris and C. reinhardtii are also absent in UTEX 250-A, including the borate transporter BOR1 and the SLT-family sodium/sulfate co-transporters. Other transporter gene families feature a single copy in UTEX 250-A, including the CTR-type copper ion transporters (3 genes in each of C. vulgaris and C. reinhardtii) and the PTB-family sodium/phosphate symporters (4 genes in C. vulgaris, at least 9 in C. reinhardtii).

The reduction in gene complexity can also be seen by BUSCO (Benchmarking Universal Single Copy Orthologs) (Manni et al. 2021) analysis against the UTEX 250-A predicted proteins. Of the 1,519 chlorophyte BUSCO genes, 1,488 (98%) were identified as complete, with only 9 (0.6%) complete and duplicated. One protein was called as fragmented, although it was manually confirmed to be a correct gene model. Of the 30 missing BUSCOs, eight could be manually recovered by cross-checking against the orthogroup analysis. The remaining 22 genes were also undetected when searching BUSCOs against the UTEX 250-A genome, suggesting they are biologically absent and not misannotated (Supplementary Data Set 10). Most notably, 9 of the 22 missing BUSCOs are part of GreenCut2, a set of almost 600 genes conserved among select photosynthetic eukaryotes and absent from nonphotosynthetic species (Karpowicz et al. 2011). A specific analysis of genes involved in photosynthesis and sexual reproduction is presented below.
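The BUSCO counts above can be cross-checked with a few lines of arithmetic. The Python sketch below only restates the numbers reported in the text; the variable names are illustrative, and the treatment of duplicated BUSCOs as a subset of the complete category follows the standard BUSCO convention and the fact that the three top-level categories sum to the full set.

```python
# Counts quoted in the text for the chlorophyte BUSCO set.
total = 1_519
complete = 1_488           # includes the complete-and-duplicated BUSCOs
duplicated = 9
fragmented = 1
missing = 30
recovered_manually = 8     # found by cross-checking the orthogroup analysis

# Complete, fragmented, and missing categories should partition the set.
categories_sum = complete + fragmented + missing

pct_complete = round(100 * complete / total, 1)
pct_duplicated = round(100 * duplicated / total, 1)

# BUSCOs still undetected after manual recovery.
putatively_absent = missing - recovered_manually
```

The categories sum to 1,519, the complete and duplicated fractions round to 98.0% and 0.6%, and 22 BUSCOs remain undetected, matching the values reported.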

We also analyzed a small number of gene families that have specifically expanded in UTEX 250-A. Import of glucose and galactose in Chlorella kessleri is facilitated by 3 genes encoding H+/Hexose symporters (HUP1, HUP2, and HUP3) (Stadler et al. 1995), and 3 paralogous genes that are most similar to C. kessleri HUP2 were identified in the A. protothecoides 0710 genome (Gao et al. 2014). We identified 5 HUP genes on both haplotypes of the UTEX 250-A genome, including a tandem array of 3 genes on chromosome A12/B12 that is disrupted by contig breaks in the 0710 assembly that resulted in misannotation. For orthogroups featuring genes from UTEX 250-A, C. vulgaris and C. reinhardtii (Fig. 5B), we identified 13 cases involving UTEX 250-A-specific expansion via multiple gene duplications (Supplementary Data Set 11). A tandem array of 4 genes on A07/B12 encodes predicted ribonuclease (RNAse) T2 proteins, which have diverse functions including roles in stress response and pathogen defense (MacIntosh 2011), although the function of the single RNAse T2 gene from C. reinhardtii is unknown. A tandem array of 6 (A01) or 5 (B01) genes encodes homologs of C. reinhardtii ferrireductase FRE1, which reduces Fe(III) to Fe(II) for Fe assimilation (Allen et al. 2007). Four of the expanded orthogroups are associated with glycosylation. Three of these correspond to gene families (HPAT, RRA, and xyloglucanase-113) that are specifically involved in arabinose O-glycosylation of the Hyp-rich glycoproteins (HRGPs) that form the vegetative and zygote cell walls in C. reinhardtii (Joo et al. 2017). With respect to proteins potentially involved in cell wall formation, it is also notable that UTEX 250-A carries multiple genes encoding polyketide synthases, as reported previously for A. protothecoides 0710 (He et al. 2016; Heimerl et al. 2018). These include homologs of PKS1 from C. reinhardtii, encoding a type I polyketide synthase involved in the formation of the zygote cell wall (Heimerl et al. 2018), and LAP5 from Arabidopsis thaliana, encoding a type III polyketide synthase involved in the formation of sporopollenin (Kim et al. 2011; He et al. 2016). Further functional characterization of these expanded gene families may provide insights into the unique biology of Auxenochlorella species.

Of the 146 and 655 genes that are uniquely present in the AHP lineage or only in UTEX 250-A, respectively, only 10% encode predicted proteins with an identifiable domain. Most notably, 61 haplotype A genes are present in five AHP or Auxenochlorella-specific orthogroups that are associated with pentapeptide repeat domains, and a further 39 genes are present in a single similarly associated orthogroup that otherwise features only one gene in each of the Chlorellales species (including P. cutis). Genes encoding pentapeptide repeat proteins are abundant in many bacterial genomes, particularly those of some cyanobacteria, although their functions remain elusive (Vetting et al. 2006). Two highly conserved GreenCut2 proteins feature a pentapeptide repeat domain of unknown function, CPLD59 and CPLD44, and are localized to the thylakoid lumen (Kieselbach et al. 1998; Schubert et al. 2002). Some mycobacterial pentapeptide repeat proteins adopt a fold that mimics double-stranded DNA and can compete with DNA in DNA-protein interactions (Feng et al. 2021). Although their functions are unclear, these 100 genes are dispersed throughout the UTEX 250-A genome, and their repetitive nature can apparently mediate genomic rearrangements via ectopic recombination (Fig. 2, B and C).

Auxenochlorella encodes a reduced set of proteins related to photosynthesis and sexual reproduction

The GreenCut

Intrigued by the potential loss of core photosynthesis genes, we next queried the UTEX 250-A and C. vulgaris annotated proteins against the entire GreenCut2 dataset, using C. reinhardtii proteins as a reference set. GreenCut2 is divided into four categories: genes conserved among green algae and land plants (CGL, i.e. the green lineage), the green lineage and the red alga C. merolae (CPL), the green lineage and diatoms (CGLD), and all of these groups (CPLD). While 12.5% of the GreenCut2 genes were not found in C. vulgaris, 18.0% were absent in UTEX 250-A, corresponding to the specific loss of 33 photosynthesis-related genes despite retaining photosynthesis (Fig. 5E; Supplementary Data Set 12). For example, UTEX 250-A encodes a reduced repertoire of light-harvesting complex proteins. Three of the LHCA genes that encode the antenna system of photosystem I are absent, namely LHCA4, LHCA5, and LHCA6, but are present in all analyzed trebouxiophytes, including C. vulgaris (Fig. 5E; Supplementary Data Set 13). This suggests that the PSI-LHCI supercomplex of Auxenochlorella likely consists of a single LHCI tetramer (with subunits LHCA1, LHCA8, LHCA7, and LHCA3) and one LHCI dimer (LHCA2 and LHCA9), as in Dunaliella species (class Chlorophyceae) (Caspy et al. 2020; Liu et al. 2025). The PSBS and LHCSR genes, encoding light-harvesting complex-like subunits involved in nonphotochemical quenching (Allorent et al. 2016), are also absent in Auxenochlorella UTEX 250-A but present in C. vulgaris (Cecchin et al. 2019), raising the question of how Auxenochlorella dissipates excess excitation energy.

The absence of certain GreenCut2 genes in Auxenochlorella demonstrates that they are not essential for photosynthesis despite their deep conservation in photosynthetic organisms. Nevertheless, we note that with comparison restricted to C. reinhardtii, some false negatives are expected. Small, divergent proteins with organellar targeting peptides would be especially susceptible to exclusion in orthology analysis. One example is RBCX (Supplementary Data Set 12), a protein chaperone that is essential for proper assembly of the RuBisCo large subunit into the octameric RbcL8 intermediate, and subsequent association with RuBisCo small subunit to form the holoenzyme (Saschenbrecker et al. 2007; Liu et al. 2010). Eukaryotic RBCX proteins are found in two distinct isoforms; RBCX-I is most similar to cyanobacterial RbcX, and RBCX-II is more divergent (Saschenbrecker et al. 2007; Bracher et al. 2015). While A. thaliana has both forms, the C. reinhardtii RBCX2A and RBCX2B genes both encode RBCX-II proteins (Kolesiński et al. 2011; Bracher et al. 2015). We identified an ortholog of the C. reinhardtii RBCX genes in C. vulgaris, but not in UTEX 250-A, despite RBCX chaperone function likely being indispensable. Manual searches using the RBCX-I proteins from A. thaliana and Synechocystis sp. PCC 6803 did reveal a candidate protein (UTEX250_A12.36490, Supplementary Fig. S9A) featuring an RbcX-like domain and a 56 amino acid N-terminus plastid transit peptide predicted by TargetP-2.0 (Almagro Armenteros et al. 2019). Interestingly, C. vulgaris also encodes a highly similar protein, suggesting that both RBCX-I and RBCX-II isoforms are present in the species (as in A. thaliana, Supplementary Fig. S9B), whereas UTEX 250-A has only RBCX-I and C. reinhardtii only RBCX-II. Comparisons of monomer AlphaFold structure models of the putative RBCX-I proteins from UTEX 250-A and C. vulgaris with the corresponding crystal structures from Synechocystis sp. PCC 6803 (Tanaka et al. 2007) and A. thaliana (Kolesiński et al. 2013) confirmed structural homology, supporting the contention that they perform the same RbcL chaperone function (Supplementary Fig. S9C).

The CiliaCut and meiosis genes

The putative hybrid origin of UTEX 250 suggests the presence of a cryptic life cycle stage in Auxenochlorella species, which may include meiosis of otherwise vegetatively diploid cells and the formation of gametes (or at least implies an asexual transition between diploid and haploid cells). Direct evidence for sexual reproduction in the Trebouxiophyceae is scarce, although Fučiková et al. (2015) identified a core set of 9 meiosis genes that are mostly present in trebouxiophyte genomes. Although Chlorella species have never been observed to form ciliated gametes, Blanc et al. (2010) identified orthologs of C. reinhardtii “CiliaCut” genes, which are conserved among ciliated organisms but absent from species without cilia (Merchant et al. 2007). Specifically, C. variabilis has genes encoding the outer and inner dynein arm proteins, but lacks most or all of the genes encoding components of the intraflagellar transport particle, radial spoke, and central pair complex (Blanc et al. 2010). The cilia genes encoded by C. variabilis significantly overlap with those present in the centric diatom Thalassiosira pseudonana (i.e. the “CentricCut”), which produces male gametes with motile cilia (Moore et al. 2017). Cecchin et al. (2019) confirmed this result, finding an even more substantial intersect between “MotileCut” cilia genes common to C. vulgaris and T. pseudonana, but absent in Caenorhabditis elegans (which produces sensory cilia, but not motile cilia), suggesting that Chlorella species may carry the necessary repertoire of genes to produce motile ciliated gametes. Most recently, Gazquez et al. (2024) demonstrated that the ciliated life stage of the lichen-forming Trebouxia lynniae corresponds to gametes, providing one of the few characterizations of a sexual lifecycle in a trebouxiophyte alga.

Only 25.3% of CiliaCut genes from C. reinhardtii are present in the UTEX 250-A genome, relative to 38.5% in C. vulgaris (Fig. 5F). However, 58.3% of the genes that form the intersect of the MotileCut and CentricCut are present in UTEX 250-A, relative to 66.7% in C. vulgaris, corresponding to only 4 genes that are uniquely absent in UTEX 250-A (Supplementary Data Set 14). Fučiková et al. (2015) identified 4 of the 9 core meiosis genes in the A. protothecoides 0710 genome, with HOPP1, REC8, MER3, MSH4, and MSH5 absent. We identified HOPP1 and REC8 in UTEX 250-A (Fig. 5F), suggesting that they were previously missed due to misannotation. MSH4 and MSH5, homologs of the bacterial mismatch repair MutS family, are absent in some sexual species, such as Plasmodium vivax, although MER3, encoding a helicase involved in the formation of meiotic crossovers (Altmannova et al. 2023), is universally present in sexual species (Fučiková et al. 2015). We also identified UTEX 250-A orthologs of two critical genes for syngamy, GEX1, which functions in nuclear fusion (Ning et al. 2013), and HAP2, which functions in gamete fusion (Fedry et al. 2017). Thus, Auxenochlorella species may be able to reproduce sexually, possibly via the formation of ciliated gametes, although, as with photosynthesis, the set of genes functioning in these processes has been substantially reduced. Notably, our analyses confirm the prior observation that all trebouxiophytes have lost the KNOX/BELL homeodomain transcription factors (Joo et al. 2018), typified by the C. reinhardtii genes GSP1 and GSM1, that regulate haploid-to-diploid transitions in Archaeplastida (Lee et al. 2008; Hisanaga et al. 2021; Hirooka et al. 2022). Thus, the molecular pathways underlying putative sexual lifecycles in both haplontic and diplontic trebouxiophytes remain cryptic.

Periodic cytosine and adenine methylation supports widespread internal antisense transcription

DNA methylation

The DNA modification 5-methylcytosine (5mC) is widespread across all domains of life and has several essential functions (Mattei et al. 2022), although relatively little is understood about cytosine methylation in green algae. In C. variabilis, gene bodies are extensively methylated at CpG dinucleotides, whereas promoter regions are hypomethylated (Zemach et al. 2010). Although 5mC is generally restricted to centromeres and subtelomeres in C. reinhardtii (Lopez et al. 2015; Chaux-Jukic et al. 2021; Craig et al. 2023), extensive gene body methylation was recently shown for the chlorophycean alga T. obliquus (Biondi et al. 2024), suggesting that this pattern may be widespread in chlorophytes. In the extremely compact genomes of prasinophytes such as Ostreococcus lucimarinus and Micromonas pusilla, CpG dinucleotides in gene bodies are also highly methylated, although 5mC occurs in periodic clusters that correspond to the linker DNA between nucleosomes (Huff and Zilberman 2014). Notably, periodic adenine methylation (6mA), mostly at ApT dinucleotides, was also discovered in C. reinhardtii (Fu et al. 2015). Periodicity of 6mA is also due to specific methylation of nucleosome linkers, although unlike for 5mC, the adenine methylation is present around TSSs and not in the gene body. Romero Charria et al. (2024) demonstrated that periodic adenine methylation of linker DNA downstream of TSSs is also present in both prasinophytes and C. variabilis, as well as in other distantly related taxa, suggesting that it is an ancient feature of eukaryotic genomes. The function of 6mA in this context is not known; the biophysical properties of this modification may play some role in promoting transcription (Bochtler and Fernandes 2021), and 6mA could also facilitate the coordination of promoter chromatin marks such as H3K4me3 (Romero Charria et al. 2024).

We used the ONT reads from CCAP 211/7C (equivalent to UTEX 250, Fig. 3A) to call both 5mC at CpG sites and 6mA at ApT sites across the UTEX 250-A genome. When only the high-confidence TSSs from IsoSeq-based gene models were used, periodicity of both modifications was evident (Fig. 6A). Cytosine methylation is essentially absent at sites adjacent to the majority of TSSs, and near-universal at the hypermethylated peaks within gene bodies (on average 5mC is called at ∼95% of CpG sites at the highest point of the peaks). Conversely, two periodic peaks of 6mA are present downstream of the TSS, although far fewer ApT sites are methylated and the signal is noisier between the peaks (Supplementary Fig. S10). Assuming that the peaks correspond to nucleosome linkers, it appears that the first two linkers downstream of the TSS are specifically associated with adenine methylation, whereas cytosine methylation is present at the third linker and reaches a stable maximum at subsequent linkers. These patterns closely resemble those of the prasinophytes, as opposed to C. variabilis where 6mA exhibits periodicity but 5mC is relatively constant across the gene body (Romero Charria et al. 2024). Huff and Zilberman (2014) proposed that 5mC periodicity contributes to nucleosome positioning and hypothesized that this could be an adaptation to extremely compact nuclei and small cell sizes. Our results suggest this unusual genomic architecture may have been present in the common ancestor of the core-Chlorophyta (Trebouxiophyceae and Chlorophyceae in Fig. 1A), or alternatively that it re-evolved on the lineage leading to Auxenochlorella.
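
A TSS-centred methylation profile of the kind summarized here can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: the input format (a per-site methylated fraction keyed by chromosome and position), the function name `median_profile`, the window size, and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_profile(sites, tss_list, window=1000):
    """Median methylated fraction at each strand-aware offset from a TSS.

    sites    -- dict mapping (chrom, pos) to the fraction of reads called
                methylated at that CpG or ApT site (hypothetical input format)
    tss_list -- iterable of (chrom, tss_pos, strand), strand '+' or '-'
    """
    by_chrom = defaultdict(list)
    for (chrom, pos), frac in sites.items():
        by_chrom[chrom].append((pos, frac))

    per_offset = defaultdict(list)
    for chrom, tss, strand in tss_list:
        for pos, frac in by_chrom[chrom]:
            # Positive offsets are downstream of the TSS on the gene's strand.
            offset = pos - tss if strand == "+" else tss - pos
            if -window <= offset <= window:
                per_offset[offset].append(frac)
    return {off: median(vals) for off, vals in sorted(per_offset.items())}


# Toy values only; they are not measurements from the study.
toy_sites = {("chr1", 100): 0.0, ("chr1", 150): 0.9,
             ("chr1", 350): 0.95, ("chr1", 400): 0.1}
toy_tss = [("chr1", 100, "+"), ("chr1", 400, "-")]
profile = median_profile(toy_sites, toy_tss)
```

Orienting offsets by strand is what allows genes on both strands to be stacked into a single profile, so that linker-associated peaks downstream of the TSS line up.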

Figure 6.

Promoter methylation and antisense lncRNAs in Auxenochlorella UTEX 250-A. A) Median per site 5mC (red) and 6mA (grey) around transcription start sites. B) IGV screenshot showing example of internal bidirectional promoter producing antisense lncRNA and truncated isoform for FAP57 (UTEX_A08.19920). The top row shows the B haplotype of the UTEX 250-A genome mapped against the A haplotype, with colored bands corresponding to single-nucleotide polymorphisms. The second row shows read coverage of ONT reads from CCAP 211/7C, where only CpG sites are shown. The proportion of red to blue corresponds to the proportion of 5mC calls in the ONT reads. The third row shows IsoSeq data, where pink reads are on the forward strand, and blue reads the reverse. The final row shows the annotated gene models, where the thin lines correspond to introns, intermediate lines to UTRs, and thick lines to coding sequence. C) GO term enrichment analysis for 780 genes overlapped by antisense lncRNAs. See Supplementary Data Set 16 for full results.

Antisense long noncoding RNAs

Aside from the biological importance of periodic methylation, these signatures also provide evidence for promoter regions, similar to H3K4me3 ChIP-seq datasets that only exist for a limited number of green algae (Strenkert et al. 2022; Petroll et al. 2025). Figure 6B shows a representative locus featuring four genes where CpG hypomethylation clearly corresponds to the regions downstream of the IsoSeq-based TSSs. Notably, one of these genes (UTEX250_A08.19925) is a lncRNA transcribed from a bidirectional hypomethylated promoter located within the CiliaCut gene FAP57 (UTEX250_A08.19920), which is essential for the correct assembly of inner dynein arms (Lin et al. 2019). While manually curating genes in the UTEX 250-A genome, we noticed many similar cases, which motivated a thorough search for lncRNA genes that are supported by multiple IsoSeq reads (Table 1). After manual verification, 165 protein-coding genes were identified with internal bidirectional promoters that produced antisense lncRNAs on at least one of the two haplotypes (Supplementary Data Set 15). For genes such as FAP57, it is unclear whether the truncated isoforms transcribed from the internal promoter could be functional, while for many others, the truncated isoforms clearly have no coding capacity. We also identified 615 protein-coding genes that exhibit substantial overlap with an antisense lncRNA (>30% of the transcript length) transcribed from an internal, or immediately adjacent, unidirectional promoter. The annotated lncRNAs are expected to be polyadenylated based on the IsoSeq protocol and ∼59% are spliced.
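
The >30% overlap criterion for antisense lncRNAs can be illustrated with a short sketch. It assumes that overlap is measured on genomic spans relative to the span of the protein-coding gene, and that introns are ignored; the text does not specify either point, and the function names and coordinates below are hypothetical.

```python
def antisense_overlap_fraction(gene, lncrna):
    """Fraction of the gene span covered by an opposite-strand lncRNA.

    Each feature is (start, end, strand) in 1-based inclusive coordinates.
    Same-strand features are not antisense, so they score 0.0.
    """
    g_start, g_end, g_strand = gene
    l_start, l_end, l_strand = lncrna
    if g_strand == l_strand:
        return 0.0
    overlap = min(g_end, l_end) - max(g_start, l_start) + 1
    return max(overlap, 0) / (g_end - g_start + 1)


def is_substantial(gene, lncrna, threshold=0.30):
    """Apply the >30% cut-off (threshold taken from the text)."""
    return antisense_overlap_fraction(gene, lncrna) > threshold


# Hypothetical coordinates for illustration.
gene = (1000, 2999, "+")          # 2,000 bp span
lnc_long = (2200, 3500, "-")      # overlaps 800 bp of the gene
lnc_short = (2900, 3500, "-")     # overlaps 100 bp of the gene
frac_long = antisense_overlap_fraction(gene, lnc_long)
frac_short = antisense_overlap_fraction(gene, lnc_short)
```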

To explore the potential functions of widespread antisense transcription in UTEX 250-A, we analyzed the predicted functions of the genes with associated lncRNAs. Manual curation of proteins encoded by genes with internal bidirectional promoters recovered potential functions associated with DNA, such as repair, chromatin, and replication, followed by functions related to post-translational regulation, cilia, and the cell wall. A gene ontology (GO) enrichment analysis focusing on the full set of 780 genes with evidence for antisense transcription supported these observations. The most significant terms for biological processes were all related to DNA repair and recombination, including the repair of double-strand breaks by homologous recombination (Fig. 6C, Supplementary Data Set 16). Indeed, the list includes many fundamental DNA repair genes, including RAD51, RAD21, the RECQ helicases RECQ1 and RECQ5, and the DNA polymerases POLK1 and POLQ2 (Supplementary Data Set 15). For cellular components, the two most significant terms, “condensed nuclear chromosome” and “cohesin complex,” are both related to mitosis, meiosis, and recombination, whereas the next two significant terms are related to cilia. The most significant molecular function term, “calcium ion binding” (Supplementary Data Set 16), is also associated with cilia genes. Indeed, 14 of the CiliaCut genes (including FAP57) exhibit antisense transcription, alongside 5 of the 8 UTEX 250-A core meiotic and syngamy genes (Fig. 5F), representing a significant enrichment (χ² = 32.05, P = 1.5 × 10⁻⁸). The nuclear fusion gene GEX1 represents an extreme case where all of the IsoSeq reads at the locus are antisense (Supplementary Fig. S11).
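
As a consistency check on the reported enrichment statistic, the P value can be recovered from the χ² value. The sketch below assumes a test with one degree of freedom (as for a 2 × 2 contingency table); the degrees of freedom are not stated in the text, and the function name is hypothetical.

```python
import math


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    For 1 d.f., P(X > stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Statistic taken from the text; the 1-d.f. assumption is ours.
p_value = chi2_sf_df1(32.05)
```

Under that assumption the statistic corresponds to a P value of approximately 1.5 × 10⁻⁸, in agreement with the reported figure.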

Based on these results, we speculate that antisense lncRNAs may play a regulatory role in Auxenochlorella. In synchronized cultures of C. reinhardtii, some DNA repair genes are specifically upregulated at the transition from dark to light (e.g. the ribonucleotide reductase subunit RIR2L, which features an antisense lncRNA in UTEX 250-A), suggesting a possible role in photodamage response (Strenkert et al. 2019). Many human DNA repair genes are also upregulated under genotoxic stresses such as UV radiation and ROS (Christmann and Kaina 2013). In C. reinhardtii, several of the core meiotic genes are among a small set of genes that lack H3K4me3 at their TSSs during vegetative growth, supporting expression restricted to the sexual cycle (Strenkert et al. 2022). It therefore seems plausible that genes overlapped by antisense lncRNAs need to be expressed specifically at certain points of the cell and life cycles, or under specific conditions. While this gene set is enriched for putative functions in DNA repair and sex, we also observe this phenomenon for many genes that are predicted to have unrelated functions, suggesting that this may be a general regulatory mechanism. Antisense lncRNAs can regulate expression of associated protein-coding genes by several mechanisms (Werner et al. 2024). For example, in compact genomes such as S. cerevisiae, transcriptional interference can occur via the physical collision of polymerase complexes on opposite strands (Prescott and Proudfoot 2002; Hobson et al. 2012). Experimental characterization, including differential expression analyses of the protein-coding and lncRNA gene pairs at these loci, will be required to test this hypothesis. However, we are not aware of similar observations in other green algae, suggesting that widespread antisense transcription may have emerged as an unusual gene regulatory mechanism during Auxenochlorella evolution. One notable observation is that the UTEX 250-A genome encodes only 15 predicted F-box domain proteins, relative to 43 in C. vulgaris and 38 in C. reinhardtii (based on the Pfam domains PF12937 and PF00646). F-box proteins function in post-translational regulation as a component of the SCF (Skp, Cullin, F-box containing) ubiquitin-ligase complex, with the F-box protein responsible for binding specific substrates destined for proteolysis (Kipreos and Pagano 2000). Although currently speculative, it is possible that transcriptional regulation of at least a subset of fundamental genes by lncRNAs emerged in the compact Auxenochlorella genome alongside a decrease in regulation via protein degradation.

An Auxenochlorella genetic toolkit: reverse genetics, selectable markers, inducible promoters, and fluorescent reporters for cellular localization

In combination with the genomic resources presented thus far, nuclear gene targeting by homologous recombination in Auxenochlorella facilitates reverse genetic analysis of biochemical pathways and processes. To illustrate this capability, we targeted CHL27, encoding the chlorophyll (Chl) biosynthesis enzyme Mg-protoporphyrin IX monomethylester (MgPMME) aerobic cyclase, reasoning that loss of Chl would provide a clear visual phenotype, and that nonphotosynthetic mutants could be maintained heterotrophically on glucose in the dark. The UTEX 250-A CHL27 locus is heterozygous and exemplifies the hybrid origin of the strain. The CHL27-1 allele on chromosome B12 is closely related to the corresponding heterozygous CHL27 alleles in UTEX 25, and the homozygous CHL27 locus in UTEX 2341; conversely, CHL27-2 on A07 has more polymorphisms in common with the heterozygous CCAP 211/61 CHL27 alleles (Fig. 7A, Supplementary Fig. S12A). CHL27 polymorphisms cluster in the introns and in the flanking intergenic region between the CHL27 3′ UTR and the downstream gene, and all but one of the coding sequence polymorphisms are synonymous, so 6 of the 7 alleles encode identical polypeptides (Supplementary Fig. S12, A and B). A nonsynonymous polymorphism in UTEX 2341 CHL27 substitutes Arg for Gly at position 33, but this change is in the region predicted to encode the plastid transit peptide, so all of the alleles are expected to produce identical mature polypeptides after chloroplast import and transit peptide cleavage (Supplementary Fig. S12B).

Figure 7.

Auxenochlorella reverse genetics and genetic toolkit. A) Neighbor-joining phylogeny of CHL27 alleles in UTEX 25, 250-A, 2341, and CCAP 211/61. Sequence comparisons encompassed the CHL27 alleles and the 5′ and 3′ flanking regions used to make targeting constructs (see Supplementary Fig. S12A). All bootstrap values >70%. B) Selection and photosynthetic growth phenotypes of representative thiamine prototrophic and G418-resistant transformants versus the corresponding wild-type controls. Cultures were grown for 5 d at 26 °C with 140 rpm shaking in the dark, or for 10 d at 24 °C with approximately 40 μmol m⁻² s⁻¹ of white light provided by cool white fluorescent bulbs. All strains grew in medium supplemented with 2 μM thiamine (+T), but only transformants expressing THIC grew in medium without added thiamine (−T). Resistance to 100 μg/mL of G418 (+T, +G418) was conferred by nptII. chl27 double knockout strains grew under both selection conditions (−T, +G418). Wild-type and chl27 heterozygotes grew photoautotrophically, but chl27 homozygous mutants were nonphotosynthetic (minimal, +T). C) Pigment accumulation under different trophic conditions in 25 mL flask cultures of UTEX 250-derived chl27 heterozygous and homozygous mutants versus UTEX 250-A. Cultures were grown with 2 μM thiamine and 20 g/L (2% G) or 5 g/L (0.5% G) of glucose for 3 d at 28 °C, with 200 rpm shaking, in the dark or with approximately 5 μmol m⁻² s⁻¹ of white light provided by warm white LEDs. D) Absorption spectra of extracts from UTEX 250-A, chl27 heterozygous and homozygous mutant strains. Cultures were grown for 4 d in the dark with 20 g/L glucose at 24 °C, 160 rpm in 24-well plates. Shoulders at 421 nm and peaks around 450 and 475 nm are consistent with absorption by lutein and zeaxanthin, and a minor peak at 663 nm reflects a small amount of chlorophyll in heterotrophic wild-type and chl27 heterozygotes. Peaks at 418, 551, and 590 nm in extracts from chl27 double knockouts are indicative of absorption by Mg-protoporphyrin IX or MgPMME. E) Selectable marker growth phenotypes of transformants compared to UTEX 250-A. Strains were cultured for 4 d in the dark at 26 °C with 140 rpm shaking, in 20 g/L sucrose (2% S), 20 g/L melibiose (2% M), 300 μg/mL hygromycin B (hyg), or for 5 d with approximately 40 μmol m⁻² s⁻¹ of white light provided by cool white fluorescent bulbs, at 24 °C with 140 rpm shaking, in tris-acetate medium containing 1.1 mM phosphite (Phi). SUC2 was targeted to the DAO1 locus, and driven by the TUB2 promoter from C. reinhardtii (pCrTUB2); MEL1 was integrated at DAO1 and driven by the endogenous UBQ1 promoter (pUBQ1); aph7″ was controlled by the promoter of HSP90A (pHSP90A) and targeted to AMT1B-1; and ptxD, under the control of the PTB1 promoter (pPTB1), was integrated at THI4. F) Fluorescence imaging of UTEX 250-A transformants expressing sucrose invertase and GFP. Cells were from cultures grown with 5 g/L of sucrose for 3 to 4 d at 25 °C, shaking at 160 rpm, and approximately 40 μmol m⁻² s⁻¹ of white light provided by cool white fluorescent bulbs.

We designed UTEX 250 CHL27-1 and CHL27-2 targeting constructs to fully eliminate the coding sequence of each allele (Supplementary Fig. S12, C and D). Since thiamine auxotrophy is a characteristic of all Auxenochlorella and Prototheca species (Pore 1972), we used THIC from A. thaliana (Franklin et al. 2013; Moseley et al. 2024) as the marker for transformation targeting CHL27-1. Neomycin phosphotransferase II (nptII), conferring resistance to G418 (Geneticin) (Franklin et al. 2011; Moseley et al. 2024; Dueñas et al. 2025a), was the selectable marker for targeting CHL27-2. We did not make allele-specific targeting constructs for UTEX 25, UTEX 2341 or CCAP 211/61, reasoning that the UTEX 250 CHL27 homology arms should have sufficient sequence identity to enable recombination (Supplementary Fig. S12A). Representative chl27-1::THIC transformants from UTEX 25, UTEX 250, and UTEX 2341 are thiamine prototrophs, while chl27-2::nptII integration into UTEX 250 and CCAP 211/61 conferred resistance to G418 (Fig. 7B). Integration of the constructs at the CHL27 loci was confirmed by PCR amplification of genomic DNA using primers flanking the homology arms (Supplementary Fig. S12E), and subsequent sequencing of the PCR products. CHL27-1 was disrupted in 8/11 thiamine prototrophic UTEX 250 transformants and 3/11 had the integrating construct mis-targeted to the CHL27-2 allele, while integration at the CHL27-2 locus was confirmed in 9/9 G418-resistant UTEX 250 transformants. Genetic tractability appears to be a feature of the Auxenochlorella genus, as evidenced by successful transformation and targeted gene replacement by homologous recombination in both A. protothecoides and A. symbiontica strains, and allele-specific targeting in the UTEX 250 hybrid. Integration into the nuclear genome occurs via homologous recombination in almost all Auxenochlorella transformants, in contrast to 2% of transformants reported for Ostreococcus (Lozano et al. 2014). 
We have documented a step-by-step guide to the transformation of Auxenochlorella using a lithium acetate/polyethylene glycol method on protocols.io (Dueñas et al. 2025b).
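
The targeting outcomes reported for UTEX 250 can be tabulated as percentages with a few lines of Python. The counts are taken from the text; the labels and variable names are ours, and the locus-level figure simply adds the two CHL27-1 construct outcomes.

```python
# (confirmed events, transformants screened), as reported for UTEX 250.
outcomes = {
    "CHL27-1 construct at CHL27-1 (on-target allele)": (8, 11),
    "CHL27-1 construct at CHL27-2 (mis-targeted allele)": (3, 11),
    "CHL27-2 construct at CHL27-2 (on-target allele)": (9, 9),
}

percent = {
    label: round(100 * hits / total, 1)
    for label, (hits, total) in outcomes.items()
}

# Every screened CHL27-1 transformant integrated at one of the two
# CHL27 alleles, so targeting to the locus was 11/11 even though
# allele specificity was 8/11.
locus_level = round(100 * (8 + 3) / 11, 1)

for label, value in percent.items():
    print(f"{label}: {value}%")
```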

Representative chl27 loss-of-function mutants (chl27-1::THIC/chl27-2::nptII) were generated by sequential targeting of the CHL27 alleles. Double knockout strains were both thiamine prototrophs and G418-resistant (Fig. 7B). Heterotrophic growth was robust for all strains, but chl27 double knockout mutants bleached and were incapable of photoautotrophic growth on minimal media (Fig. 7B). Photoautotrophic growth of single allele knockouts was equivalent to the wild-type parents, demonstrating that one copy of CHL27 was sufficient to maintain adequate Chl for photosynthesis under the growth conditions (Fig. 7B). In agreement with the observations of Shihira-Ishikawa and Hase (1964), wild-type UTEX 250-A and the chl27 heterozygous mutants grown under heterotrophic conditions with a high C/N ratio (2% glucose) contained minimal amounts of Chl, with the yellow color attributed to xanthophyll pigments (Fig. 7, C and D). Partial greening was observed in wild-type and single-allele knockout cultures grown with limiting glucose (0.5%) in the dark, and full greening was stimulated by just 5 μmol m⁻² s⁻¹ of white light (Fig. 7C). In contrast, representative chl27 double knockout cultures accumulated a red pigment under the same growth conditions (Fig. 7C). Absorbance spectra from cell extracts of wild-type and single allele knockout strains grown in the dark with 2% glucose had minor Chl peaks at 663 nm, and were consistent with lutein and zeaxanthin as the major xanthophyll pigments (Fig. 7D). Chl was not detected in the chl27 double knockouts, which instead had a major peak at 418 nm with minor peaks at 551 and 590 nm (Fig. 7D), indicative of Mg-protoporphyrin IX or MgPMME accumulation (Tottey et al. 2003; Hollingshead et al. 2012). This is the expected result of blocking the aerobic cyclase that converts red MgPMME to green divinyl protochlorophyllide (Chen et al. 2021). The photosensitive phenotype of chl27 mutants (Fig. 7B) is likely a consequence of reactive oxygen species damage caused by the highly photoactive protoporphyrin pigments (Mock 2001).

To expand our capabilities for gene targeting and metabolic engineering, we developed additional selectable markers for transformation (Fig. 7E). Two additional selectable markers have been described in Prototheca moriformis: SUC2, encoding secreted sucrose invertase from S. cerevisiae, enables heterotrophic growth on sucrose (Franklin et al. 2011), whereas MEL1, encoding secreted α-galactosidase from Saccharomyces carlsbergensis, enables heterotrophic growth on melibiose (Franklin et al. 2013). We demonstrated that Auxenochlorella transformants expressing SUC2 or MEL1 consume sucrose or melibiose, respectively, for heterotrophic growth (Fig. 7E, Supplementary Fig. S13A). The secreted SUC2 and MEL1 glycosyl hydrolases act in a non-cell-autonomous manner, making rigorous single-colony purification essential to avoid persistence of untransformed parental cells. Another aminoglycoside phosphotransferase antibiotic resistance gene, aph7″, from Streptomyces hygroscopicus, is commonly used as a C. reinhardtii transformation marker (Berthold et al. 2002), and confers resistance to 300 μg/mL hygromycin B in Auxenochlorella (Fig. 7E, Supplementary Fig. S13A). As was reported for C. reinhardtii (Loera-Quezada et al. 2016), phosphite oxidoreductase, encoded by ptxD from Pseudomonas stutzeri (Costas et al. 2001), enables mixotrophically grown Auxenochlorella to utilize phosphite as a P-source (Fig. 7E, Supplementary Fig. S13A). Together with nptII and THIC, described above, the four selectable markers illustrated in Fig. 7E can be used in serial transformations to generate homozygous knockout mutants at up to three loci, or to make heterozygous mutations at 6 independent loci.
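
The marker arithmetic in the final sentence can be made explicit. The sketch assumes that each marker is used once per strain and that each targeted allele of a diploid locus needs its own marker, as in the sequential CHL27 knockouts; the function name is hypothetical.

```python
def max_targetable_loci(n_markers, ploidy=2):
    """Return (homozygous, heterozygous) locus counts reachable by serial
    transformation when every targeted allele consumes one distinct marker."""
    homozygous = n_markers // ploidy   # one marker per allele at each locus
    heterozygous = n_markers           # one marker, one allele, per locus
    return homozygous, heterozygous


# The six markers named in the text.
markers = ["THIC", "nptII", "SUC2", "MEL1", "aph7″", "ptxD"]
homozygous_loci, heterozygous_loci = max_targetable_loci(len(markers))
```

With six markers and two alleles per locus this gives three homozygous knockouts or six heterozygous mutations, matching the figures in the text. Targeting all copies of a gene on a trisomic chromosome would need a third marker for that locus.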

We next set out to establish fluorescent protein expression in Auxenochlorella for intracellular localization studies. GFP and GFP fusions were driven by the RuBisCo small subunit (RBCS1) promoter and targeted to the neutral DAO1 locus (Dueñas et al. 2025a). Strains were cultured in the light under mixotrophic conditions such that transgene expression was activated in green cells. Chl and GFP fluorescence imaging by confocal microscopy showed that untargeted GFP was located in the cytoplasm (Fig. 7F, Supplementary Fig. S13B). Fusion of GFP to the mitochondrial targeting sequence of the OXA3 membrane insertase revealed a mitochondrial network encircling the cytoplasm (Fig. 7F, Supplementary Fig. S13B), while the chloroplast ribosomal protein L35 (PRPL35) plastid transit peptide targeted GFP to the chloroplast, as evidenced by colocalization of GFP fluorescence and Chl autofluorescence (Fig. 7F, Supplementary Fig. S13B). These strains may serve as references for the localization of cytoplasmic, mitochondrial, and plastid-targeted proteins in future research.

Next, we asked whether one of the inferred trisomies in UTEX 250-A (Fig. 4A) could be confirmed by allele-specific targeting. To do so, we took advantage of the characteristic metabolic switch of Auxenochlorella. The ammonium transporter gene AMT1B, which is highly upregulated under N-depletion in heterotrophic growth in UTEX 25 (Yan et al. 2013), is located on the right arms of chromosomes A03 and B03, with a putative third copy on the duplicate of B03 (termed “C03”, Fig. 8A). Since A03 and B03/C03 are heterozygous at this region, we designed an allele-specific construct with homology arms that specifically corresponded to the flanking sequences of the AMT1B allele of B03/C03 (termed AMT1B-1, Fig. 8A). The construct replaced the AMT1B-1 ORF with an ORF encoding the Venus fluorescent protein, which is under the control of the endogenous AMT1B-1 promoter. As above, nptII served as a transformation marker conferring resistance to G418. We designed allele-specific PCR primers that targeted the left and right flanks of the endogenous AMT1B alleles, in addition to primer pairs that targeted the construct, with one primer within the construct and the other in the sequence flanking the homology arms (Fig. 8A). Wild-type sequences for both AMT1B alleles (AMT1B-1 on B03 or C03, AMT1B-2 on A03) could be amplified from wild-type UTEX 250-A and from three representative transformants, but the transformants also contained mutant amt1B-1::Venus integrations, consistent with a trisomic state featuring one copy of AMT1B-2 and two copies of AMT1B-1 in the wild-type nuclear genome (Fig. 8B). We next compared Venus fluorescence between transformants and the UTEX 250-A wild-type negative control grown under photoautotrophic and heterotrophic conditions (Fig. 8, C and D). 
Since biomass concentration is higher in late log-phase heterotrophic cell cultures than in photoautotrophic cultures, the greater depletion of N from the medium leads to the activation of the N-deficiency-responsive AMT1B promoter. As a consequence, transformants displayed 19-fold higher Venus fluorescence in heterotrophy than in phototrophy, and on average 66-fold higher Venus fluorescence compared to the wild-type negative control (Fig. 8, C and D). We previously utilized the promoter of Photosystem I subunit D (PSAD1) to induce expression of a synthetic construct in phototrophy (Dueñas et al. 2025a). In addition to confirming trisomy, we here show that the AMT1B promoter represents a convenient counterpart for inducing expression in heterotrophy.
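
The logic of the PCR-based trisomy test can be sketched as a small model. It assumes a clonal transformant carrying a single integration event at one AMT1B-1 copy; the function and variable names are hypothetical, and the observed pattern encodes the result described in the text (both wild-type alleles and the integration amplify from transformants).

```python
def expected_products(copies, targeted_allele="AMT1B-1", n_integrations=1):
    """Predict which PCR products are detectable after allele-specific
    targeting, given the starting copy number of each allele."""
    remaining = dict(copies)
    remaining[targeted_allele] -= n_integrations
    return {
        "wild-type AMT1B-1": remaining.get("AMT1B-1", 0) > 0,
        "wild-type AMT1B-2": remaining.get("AMT1B-2", 0) > 0,
        "amt1B-1::Venus": n_integrations > 0,
    }


disomic = {"AMT1B-1": 1, "AMT1B-2": 1}    # one B03 copy, one A03 copy
trisomic = {"AMT1B-1": 2, "AMT1B-2": 1}   # B03 plus its duplicate, one A03 copy

# Pattern reported for the three representative transformants.
observed = {
    "wild-type AMT1B-1": True,
    "wild-type AMT1B-2": True,
    "amt1B-1::Venus": True,
}

disomy_fits = expected_products(disomic) == observed
trisomy_fits = expected_products(trisomic) == observed
```

Under disomy, a single integration would remove the only wild-type AMT1B-1 copy, so persistence of that product alongside the integration product is what supports a second AMT1B-1 copy.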

Figure 8.

Allele-specific targeting of a trisomic chromosome. A) Schematic of chromosome 3 copies (A03 and B03/C03) with green triangles marking the location of AMT1B. The mutant locus features a gene cassette comprised of a Venus ORF and a 537 bp terminator sequence containing the endogenous 3′ UTR of MLDP1 (MAJOR LIPID DROPLET PROTEIN 1), and a gene cassette comprised of a nptII ORF flanked by the endogenous promoter/5′ UTR and terminator/3′ UTR of PGK1B (PHOSPHOGLYCERATE KINASE 1B), which is targeted to the AMT1B-1 allele of B03/C03. Venus expression is under the control of the endogenous AMT1B-1 promoter. B) PCR amplification of AMT1B-1 (B03/C03) and AMT1B-2 (A03) 5′ and 3′ flanking regions using allele-specific primers in wild-type UTEX 250-A and three representative amt1B-1::Venus transformants, with additional amplification of amt1B-1::Venus integration in the transformants. C) Venus fluorescence detection in late-log heterotrophic transformant cultures compared to wild-type UTEX 250-A negative control. D) Comparison of chlorophyll autofluorescence (red) and Venus fluorescence (cyan) in wild-type UTEX 250-A versus a representative transformant under heterotrophic or photoautotrophic conditions using confocal fluorescence microscopy.

Finally, given the effectiveness of homologous recombination in Auxenochlorella species, we assessed the UTEX 250-A genome for genes involved in nonhomologous end joining. All of the core nonhomologous end joining genes (Chang et al. 2017), namely Ku70, Ku80, DNA ligase IV (LIG4), and XRCC4, are present in both UTEX 250-A and C. reinhardtii (Supplementary Data Set 17). A gene encoding the DNA-dependent protein kinase catalytic subunit (DNA-PKcs) is present in C. reinhardtii but absent in UTEX 250-A, as it is in A. thaliana (Khan and Ochi 2023). DNA-PKcs appears to be absent from all Chlorellales genomes analyzed, and its absence is therefore unlikely to explain the effectiveness of homologous recombination in Auxenochlorella. Overall, we conclude that Auxenochlorella has the core complement of genes functioning in nonhomologous end joining.

Discussion

Species that reproduce clonally as diploids may experience specific phenomena, including allodiploid hybridization, mitotic recombination, loss-of-heterozygosity, and aneuploidy. Although these processes can act as strong evolutionary forces, e.g. in speciation or adaptation to new environments, most of our knowledge is presently based upon yeasts and filamentous fungi. In green algae, observations consistent with these phenomena have been reported sporadically, including trisomy in C. primus (Lemieux et al. 2019), and chromosome-scale LOH and possible hybridization in T. obliquus (Biondi et al. 2024). Here, we find that each of these phenomena occurs in the trebouxiophyte genus Auxenochlorella, demonstrating the generality of these important “yeast-like” molecular and evolutionary processes in vegetatively diploid eukaryotes.

Auxenochlorella UTEX 250 is an allodiploid hybrid of two closely related species, A. protothecoides and A. symbiontica, that are differentiated by extensive karyotypic variation and possibly ecological niches. Following hybridization, the parental haplotypes have presumably been shuffled by mitotic crossovers, and homogenized via LOH events mediated by recombination, DNA repair, and potentially transient aneuploidy. Post-hybridization, the outcomes of these events can be driven by selection to resolve genomic incompatibilities between the divergent parental genomes in a process termed genome stabilization (Sipiczki 2008; Gabaldón 2020). It is presently unclear to what extent the crossovers, LOH events, and aneuploidies of UTEX 250 are involved in stabilization, although at least some of the LOH events that occurred prior to laboratory culture were presumably beneficial (e.g. at rDNA arrays). However, our observations from authentic strains of A. protothecoides and A. symbiontica suggest that LOH and aneuploidy are not just associated with hybridization: they are generally expected to be prominent evolutionary forces in Auxenochlorella that can occur on the timescale of laboratory culture. It is also unclear whether the sampling of UTEX 250 was a rare chance event or whether it may represent a successful hybrid lineage that is persistent and potentially abundant (albeit likely sterile, assuming a sexual life cycle). Given the number of Auxenochlorella strains in culture, and the potential to isolate new strains (Asker and Awad 2019; Chen et al. 2024), it would be interesting to apply population genomics approaches to investigate the prevalence of LOH, aneuploidy, and possibly other hybridization events in the genus.

The resources and genetic toolkit presented here will serve as a foundation for functional genomics and systems biology analyses, as well as the design of transformation constructs for molecular manipulation of the genome. Auxenochlorella has great potential for development as an algal model for fundamental discovery research and bioengineering, and we hope that it will complement existing reference systems. As introduced, C. reinhardtii is the premier green alga for forward and reverse genetics, with well-developed classical haploid genetics (Harris 2001), established methods for transforming the nuclear, plastid, and mitochondrial genomes (Boynton et al. 1988; Kindle et al. 1989; Kindle 1990; Randolph-Anderson et al. 1993; Shimogawara et al. 1998), well-annotated genome assemblies (Merchant et al. 2007; Craig et al. 2023), a plethora of genetic markers and molecular components (promoters, UTRs, etc.) (Crozet et al. 2018), ribonucleoprotein (RNP)-mediated gene editing (Nievergelt 2025), and extensive community resources including wild-type and mutant collections (Li et al. 2016; Li et al. 2019). Nonetheless, two major hurdles have inhibited routine transformation of the C. reinhardtii nuclear genome: the essentially random integration of transgenes by nonhomologous end joining, and the transcriptional silencing of transgenes (Schroda 2019; Nievergelt 2025). Recently, RNP-mediated gene-editing has been combined with scar-less homology-directed repair for transgene insertion at specific loci (Ferenczi et al. 2017; Akella et al. 2021; Nievergelt et al. 2023), substantially reducing position effects on transgene expression compared to un-targeted integration, and improving the proportion of transformants with seamless insertions at the recombination site up to 27% (Jacobebbinghaus et al. 2025). Mutant strains that robustly express transgenes have also been characterized (Neupert et al. 2020; Schroda and Remacle 2022).

Despite these advances, targeting of the Auxenochlorella nuclear genome by homologous recombination provides several advantages. Transformation only requires linear double-stranded DNA, in contrast to endonuclease-based methods that also depend on guide RNAs and DNA repair templates. Homologous recombination is not constrained by the requirement for double-strand breaks adjacent to Protospacer Adjacent Motif (PAM) sequences and thus provides greater flexibility in integration sites. Efficient integration by homologous recombination appears to be a general property of wild-type Auxenochlorella strains, and extends to species from the related genus Prototheca (Franklin et al. 2013; Moseley et al. 2024). Transgene integration occurs at high frequency via homologous recombination in the red algae C. merolae and Galdieria partita (Minoda et al. 2004; Fujiwara et al. 2013; Hirooka et al. 2022), but among green algae appears to be unique to Auxenochlorella and Prototheca. We have observed that the majority of Auxenochlorella transformants, typically 70% to 100%, exhibit precise, allele-specific integration at the targeted locus (Figs. 7 and 8). The upper limit for deletions is not yet defined, but in this work homology arms that are separated by more than 2 kb were used to delete the entire CDS of CHL27 and AMT1B (Figs. 7 and 8, Supplementary Fig. S12). Moreover, large cassettes containing up to four transgene expression modules, each with its own promoter, CDS and UTRs, have been integrated successfully (Bhat et al. 2025). However, we presently lack the ability in Auxenochlorella to simultaneously target multiple loci or both alleles of the same locus in a single transformation—an advantage offered by RNP-mediated approaches. Ideally, both methods would serve complementary roles depending on the application.

Beyond the versatility of genetic manipulation, several other features make Auxenochlorella an attractive reference system. Gene families containing multiple members in green algae, and especially in C. reinhardtii, more frequently feature just a single gene in Auxenochlorella UTEX 250-A. The reduced genetic redundancy is expected to improve the likelihood of achieving informative phenotypes following gene knockout or knockdown, analogous to the lower redundancy of regulatory gene families in the liverwort Marchantia polymorpha relative to A. thaliana and other angiosperms (Bowman et al. 2017). Although hundreds of genes and gene families have been entirely lost, including many genes that are expected to play roles in photosynthesis, these genes could be reintroduced to the Auxenochlorella genome to probe their functions in the species where they do occur. Mutations in genes involved in photosynthesis, such as CHL27 demonstrated herein, are also possible due to the facultative autotrophy of the species. The metabolic switch of the organism enables chloroplast biogenesis to be studied, and lends itself to the synthetic use of promoters induced in phototrophy or heterotrophy. Genomic manipulation also presents opportunities to address fundamental questions related to genetic mechanisms, as demonstrated by the dissection of the mechanism underlying the translation of bicistronic genes (Dueñas et al. 2025a). Auxenochlorella species are highly oleaginous and have GRAS status, and the genomic features reported here could also find utility in bioengineering. The introduction of artificial chromosomes is a powerful technique that enables multiple traits to be stacked at a single locus (Birchler et al. 2024). 
If the short AT-rich regions present on Auxenochlorella chromosomes can be confirmed to function as centromeres, it may be possible to maintain minichromosomes by introducing an AT-rich region, similar to the synthetic episomes that can be maintained in diatoms (Diner et al. 2017). Alternatively, the aneuploidy of UTEX 250-A and UTEX 25 could be exploited. We demonstrated allele-specific knock-in to a trisomic chromosome, and presumably it would be possible to sequentially manipulate this chromosome with minimal impact on fitness. Finally, since Auxenochlorella species are generally transformable, including the hydra symbiont A. symbiontica CCAP 211/61, the genetics of host/symbiont establishment and maintenance could be studied (Huss et al. 1994). Collectively, the yeast-like evolutionary biology and genetic manipulation of Auxenochlorella, thus far unique among green algae, presents many exciting prospects for discovery in plant biology and utilization in bioengineering.

Methods

Strains, media, and growth conditions

Auxenochlorella strains UTEX 250, UTEX 25, and UTEX 2341 were originally obtained from the University of Texas Culture Collection of Algae. A single colony of UTEX 250 was isolated and used in all experimental work (i.e. UTEX 250-A) except for CHL27 gene targeting, which was carried out in the original UTEX 250 strain. A. symbiontica CCAP 211/61 and Auxenochlorella CCAP 211/7C were obtained from the Culture Collection of Algae and Protozoa. Strains were typically grown in TAP (for DNA extraction) or HPv1 (for RNA) culture media using the trace element recipe of Kropat et al. (2011). HPv1 is a modified version of TP that replaces 20 mM Tris with 20 mM HEPES. Standard liquid cultures were generally grown under constant light (100 μmol photons m−2 s−1) and shaken at 200 rpm. Specific growth conditions were used for preparing RNA from UTEX 250-A for the generation of IsoSeq and RNAseq datasets. These included a range of photoautotrophic, mixotrophic, or heterotrophic conditions, different light conditions, CO2 levels, and elemental adjustments to media (Supplementary Data Set 18).

Extraction and sequencing of nucleic acids

UTEX 250-A DNA was extracted at UC Davis DNA Technologies Core from a frozen cell pellet and used to prepare and sequence PacBio HiFi and Omni-C linked-read libraries. Individual IsoSeq libraries were produced from RNA extracted from cultures grown in autotrophic replete and heterotrophic with 2% (w/v) glucose conditions (HPv1 media), and each library was sequenced on a Sequel II SMRT cell at UC Davis. For all remaining conditions (Supplementary Data Set 18), Illumina RNAseq with poly(A) selection was performed on a NovaSeq 6000 platform at UC Davis, generating stranded 150 bp paired-end reads.

UTEX 25, UTEX 2341, CCAP 211/61, and CCAP 211/7C were cultured mixotrophically in TAP medium. High molecular weight DNA was extracted using a modified version of a CTAB and Phenol:Chloroform protocol (Camacho et al. 2025). Cell pellets were ground in liquid nitrogen in a mortar and pestle prior to the CTAB incubation. Size selection was performed using the PacBio short read eliminator (SRE) kit following user instructions.

DNA from UTEX 25, UTEX 2341, and CCAP 211/61 was prepared in a multiplexed library along with 5 other unrelated samples and sequenced on a single PacBio Sequel II SMRT cell at QB3 Genomics, UC Berkeley. ONT sequencing for CCAP 211/61 and CCAP 211/7C was performed using the Ligation Sequencing Kit V14, two independent R10.4.1 flow cells, and a MinION Mk1B device following user instructions.

Genome assembly of Auxenochlorella UTEX 250-A

The UTEX 250-A PacBio HiFi reads were first filtered to remove any reads shorter than 18 kb, resulting in ∼0.98 Gb of total sequence. An initial phased diploid assembly was produced by passing the filtered HiFi reads and Omni-C reads to Hifiasm v0.16.1-r375 (Cheng et al. 2022), which was run with default parameters. The initial assembly consisted of two sets of phased contigs, haplotype 1 and 2, featuring 17 and 21 nuclear contigs, respectively.

Several steps of manual assessment and assembly were then performed to achieve a final telomere-to-telomere phased diploid assembly. First, haplotype 1 and haplotype 2 contigs were aligned to each other using minimap2 v2.23-r1117 (Li 2021) with the parameter “-x asm20”. The raw PacBio HiFi reads were also mapped against each haplotype assembly with minimap2 (“-x map-hifi”). Assembly and read-based alignments were manually inspected using IGV v2.16.2 (Robinson et al. 2011). Due to the rearranged nature of the parental haplotypes, we found that the two automated haplotype sets did not individually represent a complete copy of the haploid genome (i.e. some genomic regions were present twice in one haplotype assembly, and absent from the other). We therefore transferred contigs between the two haplotype assemblies to arrive at two sets of contigs that each represented a complete haploid genome, which were arbitrarily labeled haplotype A and B.

Via comparison of the haplotype A and B assemblies, it was possible to identify contigs that could potentially be fused, i.e. at locations where the ends of two contigs from one haplotype mapped to a single assembled region from the other haplotype. Based on manual inspection of the raw reads, four pairs of contigs were successfully fused. This process involved either removing redundant sequence from one of the contig ends or adding additional sequence that filled the gap between the two contigs. Where gap filling was required, haplotype-specific raw reads spanning the gap were extracted, aligned with MAFFT v7.490 (Katoh and Standley 2013), and reduced to a single consensus sequence that was subsequently trimmed and inserted between the two contigs. Following a similar approach, the termini of several chromosomes were manually extended to reach the telomeric repeats. Uniquely mapped reads that exhibited soft-clipped bases extending beyond the assembled chromosome were extracted, aligned, and reduced to a consensus sequence that was trimmed and appended to the chromosome. These manual steps resulted in the A and B haplotypes each being represented by 12 gapless chromosomes that either terminated in telomeric repeats or rDNA arrays.

Two additional contigs represented the third copies of the trisomic chromosomes, C03 and C05. The contig corresponding to chromosome C05 was manually truncated at the location of the internal telomere (see Supplementary Fig. S7), since the original contig was misassembled as an exact duplicate of chromosome B05. Finally, 8 short contigs were removed from the assembly; 6 contigs entirely featured rDNA that is already represented as truncated arrays on the assembled chromosomes, and two contigs were identified as duplicate redundant sequences that are similarly represented in the chromosomal assembly. The assembly and phasing of the entire final assembly were manually verified by examining read alignments to each haplotype.

The plastome and mitogenome were each identified as single linear contigs in the Hifiasm assembly. Circularization was performed manually by identifying and removing redundant sequences from one end of each of the organellar contigs.

Oxford Nanopore basecalling and methylation analyses

ONT raw reads derived from CCAP 211/61 were basecalled using Dorado v0.3.2 (https://github.com/nanoporetech/dorado), which was run in super accuracy basecalling mode using the command “basecaller” and the model “dna_r10.4.1_e8.2_400bps_sup@v4.2.0”. ONT reads for CCAP 211/7C were basecalled using Dorado v0.6.0 using the options “sup,5mCG_5hmCG” or “sup,6mA” to perform basecalling in super accuracy mode with the detection of 5mC and 5-hydroxymethylcytosine (5hmC) calls at CpG dinucleotides, or 6mA calls, respectively. The resulting BAM files were converted to FASTQ using samtools 1.18-1 (Danecek et al. 2021) using the “-T ‘*’” option to retain base modifications. The resulting FASTQ files were mapped against the UTEX 250-A assembly using minimap2 with recommended parameters for ONT 10.4.1 chemistry (“-x map-ont -k19 -w19 -U50,500 -g10k -a -y”). The 5mC status of each CpG site was then called using modkit v0.2.7 (https://github.com/nanoporetech/modkit) with the command “pileup --preset traditional”, which aggregates methylation calls on both strands and returns only 5mC information. The 6mA status of each ApT site was similarly called using the command “pileup --motif AT 0”. Sites with fewer than 5 mapped reads were removed. The fraction of modified bases at each remaining site was then extracted, and the median for each nucleotide position relative to the TSS was calculated across all genes that were predicted using IsoSeq data (see below).
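The final aggregation step can be illustrated with a minimal sketch. It assumes that per-site modified fractions and read depths have already been parsed from the modkit output into plain tuples; the function name, the data layout, and the 1 kb flank are hypothetical choices for illustration and are not taken from the analysis itself.

```python
from collections import defaultdict
from statistics import median


def median_profile(sites, genes, min_cov=5, flank=1000):
    """Median modified fraction at each offset from the TSS.

    sites: {chrom: [(pos, frac_modified, coverage), ...]}
    genes: [(chrom, tss, strand), ...]
    flank: hypothetical window half-width around the TSS (assumption).
    """
    by_offset = defaultdict(list)
    for chrom, tss, strand in genes:
        for pos, frac, cov in sites.get(chrom, []):
            if cov < min_cov:  # drop sites with fewer than 5 mapped reads
                continue
            # Orient the offset so that positive values are downstream of the TSS.
            offset = pos - tss if strand == "+" else tss - pos
            if -flank <= offset <= flank:
                by_offset[offset].append(frac)
    return {offset: median(vals) for offset, vals in sorted(by_offset.items())}
```

For large genomes an interval index would replace the nested loop; the nested form is kept here only for readability.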

Genome assembly of Auxenochlorella protothecoides UTEX 25 and UTEX 2341, and Auxenochlorella symbiontica CCAP 211/61

The UTEX 25, UTEX 2341, and CCAP 211/61 genome assemblies were produced using Hifiasm v0.19.8-r603. For UTEX 25 and UTEX 2341, all PacBio HiFi reads were passed to Hifiasm, which was run with default parameters. For CCAP 211/61, PacBio HiFi reads were supplemented with ONT reads, which were passed to Hifiasm with the flag “--ul”.

When run without linked-read data (e.g. Omni-C), Hifiasm produces a primary haploid assembly, as well as a diploid assembly consisting of two partially phased haplotypes. For each strain, we first mapped the two (pseudo)haplotypes of the diploid assembly to each other using minimap2 (“-x asm20”) and manually inspected the alignments to confirm that all contigs exhibited a one-to-one syntenic relationship between the two haplotypes (i.e. they are homologous without interchromosomal rearrangements as observed in UTEX 250-A). We then produced a representative haploid assembly by mapping the primary contigs to the UTEX 250-A genome assembly with minimap2 (“-x asm20”) and manually inspecting alignments. For CCAP 211/61, 14 primary contigs corresponded to 12 nuclear chromosomes and two circular organellar genomes. For both UTEX 25 and UTEX 2341, one pair of nuclear contigs and one pair of plastome contigs were each fused to produce the final assemblies, with all other major contigs corresponding to near-complete chromosomes. Contig fusion and circularization of the organellar contigs were performed using the manual assembly approaches described above. As with the UTEX 250-A assembly, additional short contigs featuring rDNA or redundant sequence were removed.

Quantification of heterozygosity and divergence among strains

To quantify heterozygosity (i.e. individual-level genetic diversity) between the A and B haplotypes of the UTEX 250-A assembly, the haplotype A and B chromosomes were aligned against each other using minimap2 (“-x asm20”). Variants segregating between the A and B haplotypes were then identified using the minimap2 script paftools.js and the parameters “call -l 2000 -L 10000” (i.e. a minimum alignment length to compute coverage of 2 kb, and a minimum alignment length to call variants of 10 kb). Chromosomes were divided into 10 kb windows, and heterozygosity was calculated based on the number of single-nucleotide variants (mapping quality of 60) relative to the number of alignable sites per window. Windows with <2,000 aligned sites were excluded. Zero-fold, 2-fold, and 4-fold degenerate sites were extracted from the coding sequence of the primary transcript of each A haplotype gene model using the degenotate tool (https://github.com/harvardinformatics/degenotate). All overall heterozygosity values were reported after removing homozygous genomic tracts that are putatively the product of LOH (see below).
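The windowed calculation described above can be sketched as follows. This is an illustrative outline only: it assumes that alignment blocks and single-nucleotide variant positions have already been extracted from the paftools.js output, that blocks do not overlap, and that coordinates are 0-based and half-open. The function name and argument layout are hypothetical.

```python
def window_heterozygosity(chrom_len, aligned_blocks, snv_positions,
                          window=10_000, min_sites=2_000):
    """Per-window heterozygosity = SNVs / alignable sites.

    aligned_blocks: non-overlapping (start, end) intervals, 0-based half-open.
    snv_positions: 0-based SNV coordinates falling inside aligned blocks.
    Windows with fewer than min_sites aligned sites are reported as None.
    """
    n_win = (chrom_len + window - 1) // window
    aligned = [0] * n_win
    snvs = [0] * n_win
    for start, end in aligned_blocks:
        # Split each block across the windows it spans.
        for w in range(start // window, (end - 1) // window + 1):
            w_start, w_end = w * window, (w + 1) * window
            aligned[w] += min(end, w_end) - max(start, w_start)
    for pos in snv_positions:
        snvs[pos // window] += 1
    return [snvs[w] / aligned[w] if aligned[w] >= min_sites else None
            for w in range(n_win)]
```

Returning None for excluded windows keeps the output aligned with window indices, which simplifies plotting along a chromosome.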

Since the UTEX 25, UTEX 2341, and CCAP 211/61 genomes were assembled to the haploid level, a different approach was used to quantify heterozygosity. Raw PacBio HiFi reads for each strain were mapped against the A haplotype of the UTEX 250-A assembly using minimap2 (“-x map-hifi”). DeepVariant v1.6.0 (Poplin et al. 2018) was then used to call variants directly from the PacBio HiFi read alignments for each strain independently, using the parameter “model_type PACBIO”. Heterozygosity was calculated in 10 kb windows by identifying heterozygous single-nucleotide variants segregating among the reads of the strain in question. Callable sites for each strain were determined by mapping the haploid genome assembly of the relevant strain against the A haplotype of UTEX 250 with minimap2 (“-x asm20”) and identifying alignment blocks with paftools.js (“call -l 2000 -L 10000”). Genetic divergence for a given strain was calculated in 10 kb windows as the proportion of variant sites relative to the UTEX 250-A haplotype A. Divergence was averaged over both haplotypes for each strain (i.e. a site where only one of the two haplotypes varied relative to the UTEX 250-A reference contributed half as much to divergence as a site where both haplotypes varied). Divergence was also calculated relative to the UTEX 250 B haplotype using the same approach.
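The haplotype-averaged divergence can be made concrete with a short sketch. It assumes diploid genotypes have been read from the DeepVariant calls as pairs of allele indices (0 for the reference allele); the function name and input format are hypothetical.

```python
def divergence(genotypes, callable_sites):
    """Divergence averaged over both haplotypes of a diploid strain.

    genotypes: iterable of (allele1, allele2), where 0 is the reference
    allele and any nonzero value is a variant allele.
    A site where one haplotype differs contributes 0.5; a site where
    both haplotypes differ contributes 1.
    """
    total = sum(((a != 0) + (b != 0)) / 2 for a, b in genotypes)
    return total / callable_sites
```

For example, one heterozygous site and two sites where both haplotypes differ from the reference across 1,000 callable sites give (0.5 + 1 + 1) / 1,000 = 0.0025.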

Neighbor joining trees for nuclear genome, plastome, and mitogenome sites were produced using MEGA v11.0.13 (Tamura et al. 2021) with the Tamura-Nei substitution model and default parameters. For the organelles, either the plastome or mitogenome assembly of each strain was aligned to the UTEX 250-A plastome or mitogenome assembly with minimap2 (“-x asm20”), and variant and invariant sites were called with paftools.js (“call -l 2000 -L 10000”). To be included in the nuclear genome analysis, a site had to meet two criteria: (i) to be on a UTEX 250-A chromosome that had not undergone any reciprocal crossovers, and (ii) to be in a region that was not affected by a LOH event in any of the strains. These criteria were enforced to ensure that ancestral evolutionary relationships among haplotypes were recovered, and resulted in the analysis of 651,399 sites on chromosome A11. Variants were extracted from either the paftools.js analysis (for variants segregating between the UTEX 250 A and B haplotypes) or the DeepVariant analyses (for variants segregating within and between the other strains). Variants in UTEX 25, UTEX 2341, and CCAP 211/61 were phased relative to their respective genome assemblies (note that these may represent pseudohaplotypes with a small number of phase changes per chromosome).

Loss-of-heterozygosity

LOH events in the UTEX 250-A genome were called from the alignment of the A and B haplotype assemblies and paftools.js analysis (see above). LOH events were arbitrarily assigned to regions that featured no heterozygous sites over a stretch of 1 kb or more. Events were designated as terminal if they extended to a chromosome end, or interstitial if they fell within a chromosome.
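The calling rule above can be sketched in a few lines. The sketch assumes a sorted list of heterozygous site positions for one chromosome (0-based) and ignores alignability, which a real analysis would need to consider; the function name and output format are hypothetical.

```python
def call_loh(chrom_len, het_positions, min_len=1_000):
    """Call LOH tracts as runs of at least min_len bp without heterozygous sites.

    Returns (start, end, kind) tuples with 0-based half-open coordinates,
    where kind is "terminal" if the tract reaches a chromosome end and
    "interstitial" otherwise.
    """
    events = []
    # Sentinels at -1 and chrom_len let the chromosome ends bound the first
    # and last tracts.
    bounds = [-1] + sorted(het_positions) + [chrom_len]
    for left, right in zip(bounds, bounds[1:]):
        start, end = left + 1, right
        if end - start >= min_len:
            terminal = start == 0 or end == chrom_len
            events.append((start, end, "terminal" if terminal else "interstitial"))
    return events
```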

LOH events were similarly called as invariant regions of at least 1 kb for UTEX 25, UTEX 2341, and CCAP 211/61 based on heterozygous variants identified from the DeepVariant analysis. CCAP 211/7C heterozygous sites and LOH events were also identified using DeepVariant, run with the parameter “model_type ONT_R104” and supplied with an alignment of the raw ONT reads against the UTEX 250-A haplotype A assembly (minimap2 “-x map-ont”). These alignments were manually inspected to confirm that the karyotypes of UTEX 250-A and CCAP 211/7C are identical.

Aneuploidy

The raw reads for all available strains were first mapped against the relevant genome assembly to calculate coverage. For PacBio datasets, the UTEX 250-A reads were mapped against the phased UTEX 250-A assembly, whereas UTEX 25, UTEX 2341, and CCAP 211/61 reads were mapped against the representative haplotype from their respective haploid assemblies, all using minimap2 (“-x map-hifi”). CCAP 211/7C ONT reads were mapped against the UTEX 250-A assembly using minimap2 (“-x map-ont”), and the “0710” reads were mapped against the UTEX 25 assembly using bwa mem (Li 2013). The coverage at each nucleotide was calculated using the samtools “depth” command, and the average coverage per 20-kb sliding window was calculated.
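As an illustration of the windowed coverage calculation, the sketch below averages per-base depth values (as produced by the samtools “depth” command) over windows of fixed size. The step between windows is not stated in the text, so the default of non-overlapping windows is an assumption, as are the function name and the in-memory list input.

```python
from itertools import accumulate


def windowed_mean_depth(depth, window=20_000, step=20_000):
    """Mean depth per window along one chromosome.

    depth: list of per-base depths for consecutive positions.
    step: distance between window starts (assumption; not given in the text).
    Returns (window_start, mean_depth) tuples.
    """
    if not depth:
        return []
    prefix = [0] + list(accumulate(depth))  # prefix sums for O(1) window totals
    out = []
    for start in range(0, max(len(depth) - window, 0) + 1, step):
        end = min(start + window, len(depth))
        out.append((start, (prefix[end] - prefix[start]) / (end - start)))
    return out
```

With this layout, a trisomic chromosome would be expected to show roughly 1.5 times the mean depth of disomic chromosomes when reads are mapped to a haploid representation.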

Auxenochlorella UTEX 250-A gene model annotation

Gene models were annotated using a hybrid approach that utilized both the IsoSeq and RNAseq datasets, followed by extensive manual curation. The A and B haplotypes of the UTEX 250-A assembly were annotated independently, except for the entirely homozygous chromosomes (04, 09, and 10).

IsoSeq reads were pre-processed using the IsoSeq3 pipeline v3.8.1 (https://github.com/PacificBiosciences/IsoSeq). Primers and poly(A) tails were first removed from circular consensus sequencing (CCS) reads. The resulting reads from the two libraries were then combined and mapped against the UTEX 250-A genome assembly using minimap2 (“-x splice:hq”). These read mappings were used to partition the reads into sets that either primarily mapped to the A haplotype (≥20 mapping quality), primarily mapped to the B haplotype, or mapped to neither with high quality (<20 mapping quality; this set was enriched for reads derived from homozygous genomic regions). Read clustering was then performed individually on each read set (“isoseq3 cluster --use-qvs”), to avoid clustering reads that derived from different alleles of the same gene. The three clustered read sets were then combined and aligned collectively to the UTEX 250-A assembly using pbmm2 v1.9.0 (“--preset ISOSEQ”). Unique isoforms were called using the isoseq3 collapse command with the options “--do-not-collapse-extra-5exons --max-5p-diff 20 --max-3p-diff 20”. These options were selected following manual assessment of the quality of the IsoSeq data and effectively enabled alternative TSSs (within a different exon of a longer isoform, or >20 bp away if on the same exon) and termination sites (>20 bp if on the same exon) to be preserved as independent isoforms.

Gene models were predicted from the isoseq3 output using SQANTI3 v5.1 in QC mode with default parameters (Pardo-Palacios et al. 2024), which utilizes GeneMarkS-T (Tang et al. 2015) to predict ORFs from isoforms. The isoseq3 pipeline groups neighboring genes under a single gene model if any of their isoforms overlap on the same strand, which is a frequent occurrence in compact genomes such as that of Auxenochlorella. We thus had to “decouple” the isoforms of incorrectly merged genes based on comparison of the protein-coding coordinates of each isoform (i.e. isoforms that did not share coding sequence were separated into independent gene models). Next, we removed any isoforms that were supported by <5 reads or that were derived from <5% of the total reads associated with a given gene. This step minimized the number of false isoforms derived from 5′ truncated reads. Finally, the primary isoform for each gene was selected based on a combination of read abundance and length criteria. For the majority of genes with multiple isoforms, the isoform with the longest coding sequence was also the most abundant and was deemed the primary isoform (where multiple isoforms had the same coding length but differed in their UTRs, the most abundant was selected). For genes where the longest coding sequence was not the most abundant, it was deemed the primary isoform if it was supported by ≥50% of the reads associated with the most abundant isoform. If this was not the case, the most abundant isoform was selected if its coding length was ≥80% of the isoform with the longest coding length. In all other cases, the longest isoform was selected to avoid overly short isoforms being selected, since they may be overrepresented in IsoSeq data.
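The isoform filtering and primary-isoform rules above can be expressed as a short decision procedure. The sketch below is one reading of those rules rather than the code used for the annotation: the Isoform record, the function names, and the choice to compute the 5% threshold against read totals before filtering are all assumptions.

```python
from typing import NamedTuple


class Isoform(NamedTuple):
    name: str
    reads: int    # supporting IsoSeq reads
    cds_len: int  # coding sequence length in bp


def filter_isoforms(isoforms, min_reads=5, min_frac=0.05):
    """Drop isoforms with <5 reads or <5% of the gene's total reads."""
    total = sum(i.reads for i in isoforms)  # total before filtering (assumption)
    return [i for i in isoforms
            if i.reads >= min_reads and i.reads >= min_frac * total]


def pick_primary(isoforms):
    """Select the primary isoform from a non-empty, filtered list."""
    # Longest CDS, with ties in coding length broken by abundance.
    longest = max(isoforms, key=lambda i: (i.cds_len, i.reads))
    top = max(isoforms, key=lambda i: (i.reads, i.cds_len))
    if longest == top:
        return longest
    if longest.reads >= 0.5 * top.reads:      # longest is reasonably abundant
        return longest
    if top.cds_len >= 0.8 * longest.cds_len:  # most abundant is nearly as long
        return top
    return longest                            # avoid overly short isoforms
```

For instance, with a 1,200 bp isoform supported by 40 reads and a 1,000 bp isoform supported by 100 reads, the shorter isoform is chosen because 40 is less than half of 100 and 1,000 bp exceeds 80% of 1,200 bp.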

We noticed that many ORFs were truncated due to predicted initiation at an internal ATG codon. To perform ORF extension, we first built a model of the Kozak sequence surrounding annotated initiation sites from primary transcripts that had no in-frame upstream ATG codons (i.e. ORFs that could not be extended), based on the assumption that most of these annotations represented true initiation sites. The strength of the Kozak sequence surrounding in-frame upstream ATG codons for the remaining transcripts was then calculated by comparison to this Kozak model (see Cross (2016) and Dueñas et al. (2025a) for details). ORF extension was performed if an upstream in-frame start codon existed with a Kozak score greater than the 25th percentile of the scores from which the Kozak model was built, or if the upstream Kozak score was simply higher than that of the annotated start codon.

In parallel to the above annotation, we used high-confidence gene models derived from the photoautotrophic IsoSeq dataset to train the AUGUSTUS v3.3.2 gene predictor tool (Stanke et al. 2008), following the protocols described by Hoff and Stanke (2019). Pre-processing and gene model annotation using isoseq3 and SQANTI3 were performed as above. The training set of gene models was produced by extracting the most abundant isoform supported by a minimum of 10 reads from gene models derived from IsoSeq reads that mapped uniquely to the A haplotype. Accurate AUGUSTUS training requires information on the noncoding DNA flanking the training set genes. To avoid including unannotated genes in the flanking regions, we performed a preliminary de novo gene annotation using BRAKER v2.1.2 (Hoff et al. 2016). Each RNAseq dataset was mapped against the UTEX 250-A assembly using STAR v2.7.10a (“--twopassMode Basic --alignIntronMax 5000”) (Dobin et al. 2013), and subsequently passed to the braker.pl script using default parameters. We then extracted BRAKER gene models that had no overlap with the IsoSeq-derived training gene set using bedtools v2.30.0 (Quinlan and Hall 2010). The combined dataset of IsoSeq training genes and nonoverlapping BRAKER genes was then used to compute the flanking region length (1,366 bp), and the final training set was produced by extracting the training set genes with flanking DNA. This dataset was then split randomly into a gene set for training (N = 3,444 genes) and for testing the trained parameters (N = 1,000). Training was performed using both the standard hidden-Markov model and the conditional random field (CRF) approach, which can result in more accurate gene prediction but is less robust to errors in the training dataset. We selected the CRF parameters based on increased sensitivity when applied to the test dataset. A final RNAseq-based gene model prediction was then performed by passing the RNAseq BAM files (see above) to BRAKER using the trained CRF parameters (“--useexisting”).

The IsoSeq and AUGUSTUS-based gene models were then semimanually integrated. IsoSeq-based models were given precedence, and an AUGUSTUS model was automatically retained if it did not share any same-strand coding exons with an IsoSeq model. AUGUSTUS models that exactly matched an IsoSeq model (with the exception of the first and/or last exon, which was often incorrectly predicted by AUGUSTUS) were immediately filtered. All other AUGUSTUS models, which could have partial overlap with an IsoSeq model, were manually curated using all IsoSeq and RNAseq evidence. Alternative transcripts were analyzed by first removing isoforms that differ only in their UTR sequences, before passing the resulting annotation to SUPPA v2.4 (Trincado et al. 2018) run with the command “-f ioe -e SE RI SS” to classify alternative isoforms into the categories “skipping exon,” “retained intron,” and “alternative splice site”.

Annotation of the plastome and mitogenome was performed using the GeSeq webserver (Tillich et al. 2017), which was provided with existing annotations from A. protothecoides “0710” and Prototheca wickerhamii (Yan et al. 2015). Additional genes were manually annotated by searching for long ORFs in the unannotated sequence.

Repeat annotation

Preliminary annotation of interspersed repeats was performed by running RepeatModeler 2.0.5 (Flynn et al. 2020) on the UTEX 250-A genome with the parameter “-LTRStruct”. Consensus sequences for specific transposon families were then manually curated following Goubert et al. (2022) (see Supplementary Data Set 3, Supplementary Fig. S2). The manually curated models were then combined with the automated models and passed to RepeatMasker v4.1.6 (https://github.com/Dfam-consortium/RepeatMasker) (“-gccalc”). Tandem Repeats Finder v4.10.0 was run on the UTEX 250-A genome to identify tandem repeats using the parameters “2 7 7 80 10 50 2000 -f -d -m -ngs”. Total repeat density was calculated by combining the output of RepeatMasker and TRF. Transposon genes (see Table 1, Supplementary Fig. S2) were identified and marked in the GFF3 annotation by intersecting the coordinates of manually curated transposons and gene model coding sequence, followed by manual inspection.

Repeat densities for the genome assemblies in Table 1 were similarly calculated by combining the output of RepeatModeler/RepeatMasker and TRF. RepeatModeler was not run for A. protothecoides 0710, which was repeat-masked using the UTEX 250-A repeat library (see above), nor for C. reinhardtii, for which an extensive repeat library exists (Craig et al. 2021).

Gene family and phylogenetic analyses

Gene family analyses were performed using the 19 species shown in Fig. 1A. Protein sequences for each species were reduced to their primary isoforms, and, where possible, genes on the organelle genomes were removed. Annotated transposon proteins were included if known. For UTEX 250-A, all A haplotype genes were arbitrarily selected and supplemented by genes that are putatively unique to the B haplotype (see below). For Picochlorum sp. BPE23, the genes from only haplotype A were arbitrarily used. Metadata for protein datasets are presented in Supplementary Data Set 19.

OrthoFinder v2.5.5 was run on the 19 protein sets using the Diamond search method (“-S diamond_ultra_sens”) (Emms and Kelly 2019). Orthogroups featuring a known transposon protein were filtered out (the results of which are reflected in Table 1). GreenCut2 and CiliaCut orthologs were extracted for UTEX 250-A and C. vulgaris by comparison to C. reinhardtii (i.e. from the “orthologues” output directory of OrthoFinder). For the specific genes shown in Fig. 5, E and F (LHCA, NPQ, meiosis, and syngamy), the gene trees for the underlying orthogroups were manually validated.

The species tree in Fig. 1A was produced from a concatenated alignment of the 669 single-copy orthologs identified by OrthoFinder. For each ortholog, proteins were aligned using MAFFT v7.525 (“--maxiterate 1000 --localpair”), and the resulting alignments were trimmed using trimAl v1.4.rev22 (“-automated1”) (Capella-Gutiérrez et al. 2009). The trimmed alignments were concatenated and a maximum likelihood phylogeny was produced using IQ-TREE v2.3.0 (“-m MFP -bb 1000”) (Minh et al. 2020). For Fig. 1B, rDNA sequences were accessed from NCBI (Supplementary Data Set 1), aligned with MAFFT (“L-INS-i”), manually trimmed, and passed to IQ-TREE as above.

Gene IDs and functional annotation

Allelic gene pairs were determined by running SynChro (Drillon et al. 2014), a tool for detecting syntenic orthologs between genomes, on the A and B haplotype genomes and gene models. Genes were assigned a unique 5-digit number starting with “00005” for the first gene on chromosome A01 (i.e. UTEX250_A01.00005), with each subsequent gene along the A chromosomes receiving a number that increased in increments of 5 (i.e. UTEX250_A01.00010, UTEX250_A01.00015, etc.). Genes on the B haplotype that had been identified as part of an allelic pair were assigned the gene number of their corresponding A gene, regardless of their order on the B chromosomes. Genes that were determined to be unique to haplotype B were given new gene numbers that extended beyond the last number that was used on haplotype A. Genes on the plastome were given unique numbers starting at “80000” (e.g. UTEX250_xCp.80000), and genes on the mitogenome starting at “90000” (e.g. UTEX250_xMt.90000). Isoforms were distinguished by adding a number to the gene ID, with the primary isoform labeled as “1,” e.g. UTEX250_A01.00030.1.
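The identifier scheme can be illustrated with a small helper. This is a sketch for illustration only: the function names are hypothetical, and whether numbering restarts on each chromosome is not stated above, so the starting number is left as a parameter.

```python
def gene_ids(chrom, n_genes, start=5, step=5, prefix="UTEX250"):
    """Generate gene IDs such as UTEX250_A01.00005 in increments of 5.

    start: number of the first gene on this chromosome (assumption: the
    caller supplies it, since continuation across chromosomes is unspecified).
    """
    return [f"{prefix}_{chrom}.{start + i * step:05d}" for i in range(n_genes)]


def isoform_id(gene_id, isoform_number=1):
    """Append the isoform number; the primary isoform is labeled 1."""
    return f"{gene_id}.{isoform_number}"
```

Numbering in steps of 5 leaves unused identifiers between neighboring genes, which allows newly annotated genes to be added later without renumbering.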

In addition to IDs, genes were also assigned a unique name or “gene symbol” where available, following the nomenclature of C. reinhardtii (Craig et al. 2023). Gene symbols were transferred based on the gene orthology relationships between C. reinhardtii and Auxenochlorella UTEX 250-A determined by OrthoFinder (“Orthologues” output directory). For genes with 1:1 orthology the C. reinhardtii symbol was directly transferred to UTEX 250-A. For multicopy gene families in C. reinhardtii that are represented by a single gene in UTEX 250-A, either the symbol with the lowest number was used (e.g. FOX1/FOX2 gene pair orthologous to FOX1 in UTEX 250-A), or if letters are used, they were removed (e.g. CMT1A/CMT1B gene pair orthologous to CMT1 in UTEX 250-A). For the reverse situation, letters were appended to the UTEX 250-A gene symbol where possible (e.g. LPAAT1 orthologous to LPAAT1A/LPAAT1B gene pair in UTEX 250-A). Genes on the nuclear genome in C. reinhardtii but on an organelle genome in UTEX 250-A were changed accordingly (e.g. CHLI1 orthologous to chlI in UTEX 250-A plastome). Several gene symbols were manually curated, including a small number of genes that are absent from C. reinhardtii (e.g. the photosystem I subunit gene psaM), and all of the histone genes (for which orthology relationships are difficult to determine). All gene symbol relationships between C. reinhardtii and UTEX 250-A are presented in Supplementary Data Set 7.

Protein domains were identified by running InterProScan v5.67-99.0 (“-dp -goterms”).

Identification of antisense long noncoding RNA genes

LncRNA genes were annotated based on IsoSeq data using two approaches. First, IsoSeq-based gene models that had initially been annotated as protein-coding but had no orthology to proteins from other species, nor a recognized Pfam or CDD domain, were reassigned as lncRNAs if they overlapped the coding sequence of another gene that did meet one of these criteria. Second, IsoSeq gene models supported by at least 5 reads for which no ORF had been annotated by SQANTI3 were added. The first category of genes had been assigned gene IDs as described above, and retained these IDs as lncRNAs instead of protein-coding genes. New gene IDs starting from “50000” and increasing in increments of 5 were introduced for the second category of lncRNA genes.

The coordinates of protein-coding genes with either a homolog in at least one other species or a recognized domain were then intersected with the lncRNA gene coordinates, and those with antisense overlap spanning at least 30% of the protein-coding gene length were selected for manual curation. Validated gene models with antisense lncRNAs were then divided into those with internal bidirectional promoters and those with internal or immediately adjacent promoters without evidence for bidirectional activity. Gene ontology enrichment analysis was performed with topGO (Alexa and Rahnenfuhrer 2023), using the GO term output from InterProScan and the classic Fisher test option.
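As an illustration of the 30% antisense overlap criterion, the sketch below computes the fraction of a protein-coding gene covered by an lncRNA on the opposite strand. It rests on several assumptions not stated in the text: coordinates are 0-based and half-open, "gene length" is taken as the genomic span of the gene, each gene is compared against one lncRNA at a time, and the names `Feature`, `antisense_overlap_fraction`, and `passes_threshold` are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive
    strand: str  # "+" or "-"


def antisense_overlap_fraction(gene, lncrna):
    """Fraction of the gene span covered by an lncRNA on the opposite strand."""
    if gene.chrom != lncrna.chrom or gene.strand == lncrna.strand:
        return 0.0  # different chromosome or same strand: not antisense
    overlap = min(gene.end, lncrna.end) - max(gene.start, lncrna.start)
    return max(overlap, 0) / (gene.end - gene.start)


def passes_threshold(gene, lncrna, min_fraction=0.30):
    """True if the antisense overlap covers at least min_fraction of the gene."""
    return antisense_overlap_fraction(gene, lncrna) >= min_fraction


gene = Feature("chr1", 1000, 2000, "+")
print(passes_threshold(gene, Feature("chr1", 1600, 2500, "-")))  # 400 of 1000 bp overlap
print(passes_threshold(gene, Feature("chr1", 1800, 2500, "-")))  # 200 of 1000 bp overlap
print(passes_threshold(gene, Feature("chr1", 1600, 2500, "+")))  # same strand
```

Genes passing this filter would then proceed to the manual curation step described above; the filter itself does not distinguish promoter architecture.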

Genetic toolkit

Transformation, fluorescence measurements, and microscopy were performed following Dueñas et al. (2025a) and Dueñas et al. (2025b). Full details are provided in Supplementary File 1; the sequences of all primers used in this study can be found in Supplementary Data Set 20, and the plasmid sequences are provided as a GenBank file in Supplementary File 2.

Accession numbers

NCBI accession numbers for all Auxenochlorella UTEX 250-A genes named in the main text are available in Supplementary Data Set 21.

Acknowledgments

Confocal fluorescence microscopy was performed at the CRL Molecular Imaging Center, RRID:SCR_017852. We thank two anonymous reviewers for their insightful comments on an earlier version of the manuscript.

Contributor Information

Rory J Craig, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA; School of BioSciences, University of Melbourne, Parkville, VIC 3010, Australia.

Marco A Dueñas, Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.

Dimitrios J Camacho, Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA.

Sean D Gallaher, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA.

Maria Clara Avendaño-Monsalve, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA.

Yang-Tsung Lin, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA.

Crysten E Blaby-Haas, Molecular Foundry, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Jeffrey L Moseley, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA.

Sabeeha S Merchant, California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA 94720, USA; Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA; Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA; Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Author contributions

R.J.C., J.L.M., and S.S.M. designed the research. R.J.C., M.A.D., D.J.C., S.D.G., M.C.A.-M., Y.-T.L., and J.L.M. performed the research. R.J.C., M.A.D., S.D.G., M.C.A.-M., C.E.B.-H., and J.L.M. analyzed the data. R.J.C., M.A.D., J.L.M., and S.S.M. wrote the paper.

Supplementary data

The following materials are available in the online version of this article.

Supplementary Figure S1. Permuted tRNA genes in Auxenochlorella UTEX 250-A.

Supplementary Figure S2. IGV screenshots of transposable element insertions that are polymorphic between the A and B haplotypes.

Supplementary Figure S3. Putative AT-rich short regional centromeres in Auxenochlorella UTEX 250-A Haplotype A.

Supplementary Figure S4. Map of Auxenochlorella UTEX 250-A plastome.

Supplementary Figure S5. Map of Auxenochlorella UTEX 250-A mitogenome.

Supplementary Figure S6. UTEX 250-A and CCAP 211/7C chromosome schematics and heterozygosity.

Supplementary Figure S7. IGV screenshot of putative de novo telomere on duplicate fragment of chromosome B05.

Supplementary Figure S8. PacBio read coverage in 20-kb windows across a representative haplotype of the Auxenochlorella symbiontica CCAP 211/61 and Auxenochlorella protothecoides UTEX 2341 genome assemblies.

Supplementary Figure S9. Putative RBCX orthologs in Auxenochlorella UTEX 250-A and Chlorella vulgaris.

Supplementary Figure S10. 5-methylcytosine and N6-methyladenine around transcription start sites of UTEX 250-A genes.

Supplementary Figure S11. IGV screenshot of GEX1 and antisense lncRNA.

Supplementary Figure S12. Auxenochlorella reverse genetics.

Supplementary Figure S13. Auxenochlorella selection markers and intracellular localization.

Supplementary Data Set 1. Auxenochlorella strain metadata for Fig. 1B.

Supplementary Data Set 2. Summary statistics for existing Auxenochlorella genome assemblies.

Supplementary Data Set 3. Annotation notes for manually curated transposable elements and other interspersed repeats in the Auxenochlorella UTEX 250-A genome.

Supplementary Data Set 4. Possible centromere coordinates for the Auxenochlorella UTEX 250-A genome.

Supplementary Data Set 5. Summary statistics for Auxenochlorella protothecoides and Auxenochlorella symbiontica haploid genome assemblies.

Supplementary Data Set 6. Percentage of UTEX 250-A proteins with functional domains as determined by InterProScan.

Supplementary Data Set 7. Auxenochlorella UTEX 250-A genes and gene symbols and their orthology to C. reinhardtii genes and gene symbols.

Supplementary Data Set 8. Auxenochlorella UTEX 250-A heterozygosity by site class.

Supplementary Data Set 9. Auxenochlorella UTEX 250-A orthogroup analysis.

Supplementary Data Set 10. BUSCO genes that have been lost from the Auxenochlorella UTEX 250-A genome.

Supplementary Data Set 11. Expanded orthogroups in Auxenochlorella UTEX 250-A genome featuring tandem duplication of genes.

Supplementary Data Set 12. GreenCut2 genes that are absent in the Auxenochlorella UTEX 250-A genome.

Supplementary Data Set 13. Presence and absence of LHCA genes and specific genes involved in nonphotochemical quenching across Trebouxiophyceae.

Supplementary Data Set 14. CiliaCut genes that are absent in the Auxenochlorella UTEX 250-A genome.

Supplementary Data Set 15. UTEX 250-A protein-coding genes with IsoSeq evidence for antisense lncRNA transcription.

Supplementary Data Set 16. Significant GO terms for genes with antisense lncRNA transcription.

Supplementary Data Set 17. Core genes involved in non-homologous end joining.

Supplementary Data Set 18. UTEX 250-A growth conditions for RNA extraction.

Supplementary Data Set 19. Species used in OrthoFinder analyses.

Supplementary Data Set 20. Primers for genotyping.

Supplementary Data Set 21. NCBI accessions for Auxenochlorella UTEX 250-A genes named in the main text.

Supplementary File 1. Supplementary methods for the Auxenochlorella genetic toolkit.

Supplementary File 2. GenBank file of plasmid sequences used in this study.

Supplementary File 3. FASTA file of sequences associated with Fig. 1A.

Supplementary File 4. Newick file of Fig. 1A phylogeny.

Supplementary File 5. FASTA file of sequences associated with Fig. 1B.

Supplementary File 6. Newick file of Fig. 1B phylogeny.

Supplementary File 7. FASTA file of sequences associated with Fig. 2D.

Supplementary File 8. Newick file of Fig. 2D phylogeny.

Supplementary File 9. FASTA file of sequences associated with Fig. 2E.

Supplementary File 10. Newick file of Fig. 2E phylogeny.

Supplementary File 11. FASTA file of sequences associated with Fig. 2F.

Supplementary File 12. Newick file of Fig. 2F phylogeny.

Supplementary File 13. FASTA file of sequences associated with Fig. 7A.

Supplementary File 14. Newick file of Fig. 7A phylogeny.

Funding

This work was supported by the U.S. Department of Energy Office of Science, Biological and Environmental Research program under award no. DE-SC0023027 (to SSM and JLM) with support from the Gordon and Betty Moore Foundation (to SSM) for the work on A. symbiontica (9203). RJC was supported for his work on Auxenochlorella genome sequencing and assembly in Berkeley, in part, by Laboratory Directed Research and Development (to SSM) under U.S. Department of Energy Contract No. DE-AC02-05CH11231. MAD was supported, in part, by the National Institutes of Health under award numbers 1T32GM132022-01 and 1F31GM157804, in addition to the University of California, Berkeley, Newton Graduate Fellowship in Synthetic Biology (QB3-Berkeley). DJC was supported in part by the National Institutes of Health award no. 5T32GM007232-44 and the University of California, Berkeley, Chancellor's Fellowship. Work at the Molecular Foundry was supported by the Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231 (CEB-H). Work at the Joint Genome Institute facility (https://ror.org/04xm1d337) is supported by the U.S. Department of Energy Office of Science operated under Contract No. DE-AC02-05CH11231 (CEB-H).

Conflict of interest statement

J.L.M. served as a paid consultant for Phycoil Biotechnology International, Inc. (PBI), a company that uses Auxenochlorella and other microorganisms to develop therapeutic and nutritional oils. PBI had no role in the design, execution, data collection, analysis, or interpretation of this study, nor was the company involved in the preparation, review, or approval of this manuscript. All authors declare that they have no other conflicts of interest to disclose related to this publication.

Data availability

All sequencing data associated with Auxenochlorella UTEX 250-A are available from NCBI under the BioSample SAMN45466464. The UTEX 250-A genome and gene annotations are specifically available under the BioProjects PRJNA1195245 (Haplotype A) and PRJNA1195244 (Haplotype B). Haploid genome assemblies and sequencing data for UTEX 25, UTEX 2341, CCAP 211/61 and CCAP 211/7A (sequencing data only) are available from NCBI under the BioProject PRJNA1328465. Alignment and tree files for phylogenetic analysis are provided as Supplementary Files 3 to 14.

References

  1. Akella  S, Ma  X, Bacova  R, Harmer  ZP, Kolackova  M, Wen  X, Wright  DA, Spalding  MH, Weeks  DP, Cerutti  H. Co-targeting strategy for precise, scarless gene editing with CRISPR/Cas9 and donor ssODNs in Chlamydomonas. Plant Physiol. 2021:187(4):2637–2655. 10.1093/plphys/kiab418 [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Alexa  A, Rahnenfuhrer  J. 2023. topGO: enrichment analysis for gene ontology. R package version 2.54.0. 10.18129/B9.bioc.topGO.R [DOI]
  3. Allen  MD, del Campo  JA, Kropat  J, Merchant  SS. FEA1, FEA2, and FRE1, encoding two homologous secreted proteins and a candidate ferrireductase, are expressed coordinately with FOX1 and FTR1 in iron-deficient Chlamydomonas reinhardtii. Eukaryot Cell. 2007:6(10):1841–1852. 10.1128/EC.00205-07 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Allorent  G, Lefebvre-Legendre  L, Chappuis  R, Kuntz  M, Truong  TB, Niyogi  KK, Ulm  R, Goldschmidt-Clermont  M. UV-B photoreceptor-mediated protection of the photosynthetic machinery in Chlamydomonas reinhardtii. Proc Natl Acad Sci U S A. 2016:113(51):14864–14869. 10.1073/pnas.1607695114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Almagro Armenteros  JJ, Salvatore  M, Emanuelsson  O, Winther  O, von Heijne  G, Elofsson  A, Nielsen  H. Detecting sequence signals in targeting peptides using deep learning. Life Sci Alliance. 2019:2(5):e201900429. 10.26508/lsa.201900429 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Altmannova  V, Firlej  M, Muller  F, Janning  P, Rauleder  R, Rousova  D, Schaffler  A, Bange  T, Weir  JR. Biochemical characterisation of Mer3 helicase interactions and the protection of meiotic recombination intermediates. Nucleic Acids Res. 2023:51(9):4363–4384. 10.1093/nar/gkad175 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Andersen  MP, Nelson  ZW, Hetrick  ED, Gottschling  DE. A genetic screen for increased loss of heterozygosity in Saccharomyces cerevisiae. Genetics. 2008:179(3):1179–1195. 10.1534/genetics.108.089250 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Arriola  MB, Velmurugan  N, Zhang  Y, Plunkett  MH, Hondzo  H, Barney  BM. Genome sequences of Chlorella sorokiniana UTEX 1602 and Micractinium conductrix SAG 241.80: implications to maltose excretion by a green alga. Plant J. 2018:93(3):566–586. 10.1111/tpj.13789 [DOI] [PubMed] [Google Scholar]
  9. Asker  D, Awad  TS. Isolation and characterization of a novel lutein-producing marine microalga using high throughput screening. Food Res Int. 2019:116(February):660–667. 10.1016/j.foodres.2018.08.093. [DOI] [PubMed] [Google Scholar]
  10. Bakuła  Z, Siedlecki  P, Gromadka  R, Gawor  J, Gromadka  A, Pomorski  JJ, Panagiotopoulou  H, Jagielski  T. A first insight into the genome of Prototheca wickerhamii, a major causative agent of human protothecosis. BMC Genomics. 2021:22(1):168. 10.1186/s12864-021-07491-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Bao  W, Jurka  J. Homologues of bacterial TnpB_IS605 are widespread in diverse eukaryotic transposable elements. Mob DNA. 2013:4(1):12. 10.1186/1759-8753-4-12 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Barten  R, van Workum  D-IM, de Bakker  E, Risse  J, Kleisman  M, Navalho  S, Smit  S, Wijffels  RH, Nijveen  H, Barbosa  MJ. Genetic mechanisms underlying increased microalgal thermotolerance, maximal growth rate, and yield on light following adaptive laboratory evolution. BMC Biol. 2022:20(1):242. 10.1186/s12915-022-01431-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Becker  SA, Spreafico  R, Kit  JL, Brown  R, Likhogrud  M, Fang  W, Posewitz  MC, Weissman  JC, Radakovits  R. Phased diploid genome sequence for the fast-growing microalga Picochlorum celeri. Microbiol Resour Announc. 2020:9(20):e00087–e00020. 10.1128/MRA.00087-20 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Benites  LF, Bucchini  F, Sanchez-Brosseau  S, Grimsley  N, Vandepoele  K, Piganeau  G. Evolutionary genomics of sex-related chromosomes at the base of the green lineage. Genome Biol Evol. 2021:13(10):evab216. 10.1093/gbe/evab216 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Berthold  P, Schmitt  R, Mages  W. An engineered Streptomyces hygroscopicus aph 7” gene mediates dominant resistance against hygromycin B in Chlamydomonas reinhardtii. Protist. 2002:153(4):401–412. 10.1078/14344610260450136 [DOI] [PubMed] [Google Scholar]
  16. Bhat  R, Im  BH, Kim  J, Im  CS. Microalgae compositions and methods for treating disease. World Intellectual Property Organization application number WO2025101f37A1, filed November 1, 2024, and published May 15, 2025
  17. Biondi  TC, Kruse  CPS, Koehler  SI, Kwon  T, Davis  AK, Eng  W, Kunde  Y, Gleasner  CD, Mak  KTY, Polle  J, et al.  The telomere-to-telomere, gapless, phased diploid genome and methylome of the green alga Scenedesmus obliquus UTEX 3031 reveals significant heterozygosity and genetic divergence of the haplotypes. Algal Res. 2024:79(April):103431. 10.1016/j.algal.2024.103431 [DOI] [Google Scholar]
  18. Birchler  JA, Kelly  J, Singh  J, Liu  H, Zhang  Z, Char  SN, Sharma  M, Yang  H, Albert  PS, Yang  B. Synthetic minichromosomes in plants: past, present, and promise. Plant J. 2024:120(6):2356–2366. 10.1111/tpj.17142 [DOI] [PubMed] [Google Scholar]
  19. Blaby  IK, Blaby-Haas  CE, Tourasse  N, Hom  EF, Lopez  D, Aksoy  M, Grossman  A, Umen  J, Dutcher  S, Porter  M, et al.  The Chlamydomonas genome project: a decade on. Trends Plant Sci. 2014:19(10):672–680. 10.1016/j.tplants.2014.05.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Blaby-Haas  CE, Merchant  SS. Comparative and functional algal genomics. Annu Rev Plant Biol. 2019:70(1):605–638. 10.1146/annurev-arplant-050718-095841 [DOI] [PubMed] [Google Scholar]
  21. Blanc  G, Agarkova  I, Grimwood  J, Kuo  A, Brueggeman  A, Dunigan  DD, Gurnon  J, Ladunga  I, Lindquist  E, Lucas  S, et al.  The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation. Genome Biol. 2012:13(5):R39. 10.1186/gb-2012-13-5-r39 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Blanc  G, Duncan  G, Agarkova  I, Borodovsky  M, Gurnon  J, Kuo  A, Lindquist  E, Lucas  S, Pangilinan  J, Polle  J, et al.  The Chlorella variabilis NC64A genome reveals adaptation to photosymbiosis, coevolution with viruses, and cryptic sex. Plant Cell. 2010:22(9):2943–2955. 10.1105/tpc.110.076406 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Bochtler  M, Fernandes  H. DNA adenine methylation in eukaryotes: enzymatic mark or a form of DNA damage?  Bioessays. 2021:43(3):e2000243. 10.1002/bies.202000243 [DOI] [PubMed] [Google Scholar]
  24. Bowman  JL, Kohchi  T, Yamato  KT, Jenkins  J, Shu  S, Ishizaki  K, Yamaoka  S, Nishihama  R, Nakamura  Y, Berger  F, et al.  Insights into land plant evolution garnered from the Marchantia polymorpha genome. Cell. 2017:171(2):287–304.e15. 10.1016/j.cell.2017.09.030 [DOI] [PubMed] [Google Scholar]
  25. Boynton  JE, Gillham  NW, Harris  EH, Hosler  JP, Johnson  AM, Jones  AR, Randolph-Anderson  BL, Robertson  D, Klein  TM, Shark  KB, et al.  Chloroplast transformation in Chlamydomonas with high velocity microprojectiles. Science. 1988:240(4858):1534–1538. 10.1126/science.2897716 [DOI] [PubMed] [Google Scholar]
  26. Bracher  A, Hauser  T, Liu  C, Hartl  FU, Hayer-Hartl  M. Structural analysis of the rubisco-assembly chaperone RbcX-II from Chlamydomonas reinhardtii. PLoS One. 2015:10(8):e0135448. 10.1371/journal.pone.0135448 [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Brooks  G, Franklin  S, Avila  J, Decker  SM, Baliu  E, Rakitsky  W, Piechocki  J, Zdanis  D, Norris  LM. Microalgal flour. United States US12059006B2, filed November 5 2021, and issued July 26, 2024.
  28. Bulankova  P, Sekulić  M, Jallet  D, Nef  C, van Oosterhout  C, Delmont  TO, Vercauteren  I, Osuna-Cruz  CM, Vancaester  E, Mock  T, et al.  Mitotic recombination between homologous chromosomes drives genomic diversity in diatoms. Curr Biol. 2021:31(15):3221–3232.e9. 10.1016/j.cub.2021.05.013 [DOI] [PubMed] [Google Scholar]
  29. Calhoun  S, Bell  TAS, Dahlin  LR, Kunde  Y, LaButti  K, Louie  KB, Kuftin  A, Treen  D, Dilworth  D, Mihaltcheva  S, et al.  A multi-omic characterization of temperature stress in a halotolerant Scenedesmus strain for algal biotechnology. Commun Biol. 2021:4(1):333. 10.1038/s42003-021-01859-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Camacho  DJ, Craig  RJ, Merchant  SS. Extraction and sequencing of high molecular weight genomic DNA from Auxenochlorella protothecoides using the Oxford Nanopore Technologies MinION platform. doi.org/10.17504/protocols.io.14egn6r8ql5d/v1. Protocols.io, 2025.
  31. Capella-Gutiérrez  S, Silla-Martínez  JM, Gabaldón  T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009:25(15):1972–1973. 10.1093/bioinformatics/btp348 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Caspy  I, Malavath  T, Klaiman  D, Fadeeva  M, Shkolnisky  Y, Nelson  N. Structure and energy transfer pathways of the Dunaliella salina photosystem I supercomplex. Biochim Biophys Acta Bioenerg. 2020:1861(10):148253. 10.1016/j.bbabio.2020.148253 [DOI] [PubMed] [Google Scholar]
  33. Cecchin  M, Marcolungo  L, Rossato  M, Girolomoni  L, Cosentino  E, Cuine  S, Li-Beisson  Y, Delledonne  M, Ballottari  M. Chlorella vulgaris genome assembly and annotation reveals the molecular basis for metabolic acclimation to high light conditions. Plant J. 2019:100(6):1289–1305. 10.1111/tpj.14508 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Chan  PP, Lin  BY, Mak  AJ, Lowe  TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021:49(16):9077–9096. 10.1093/nar/gkab688 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Chang  HHY, Pannunzio  NR, Adachi  N, Lieber  MR. Non-homologous DNA end joining and alternative pathways to double-strand break repair. Nat Rev Mol Cell Biol. 2017:18(8):495–506. 10.1038/nrm.2017.48 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Charron  G, Marsit  S, Henault  M, Martin  H, Landry  CR. Spontaneous whole-genome duplication restores fertility in interspecific hybrids. Nat Commun. 2019:10(1):4126. 10.1038/s41467-019-12041-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Chaux-Jukic  F, O’Donnell  S, Craig  RJ, Eberhard  S, Vallon  O, Xu  Z. Architecture and evolution of subtelomeres in the unicellular green alga Chlamydomonas reinhardtii. Nucleic Acids Res. 2021:49(13):7571–7587. 10.1093/nar/gkab534 [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Chen  GYE, Adams  NBP, Jackson  PJ, Dickman  MJ, Hunter  CN. How the O2-dependent Mg-protoporphyrin monomethyl ester cyclase forms the fifth ring of chlorophylls. Nat Plants. 2021:7(3):365–375. 10.1038/s41477-021-00876-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Chen  H, Sosa  A, Chen  F. Growth and cell size of microalga Auxenochlorella protothecoides AS-1 under different trophic modes. Microorganisms. 2024:12(4):835. 10.3390/microorganisms12040835 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Cheng  H, Jarvis  ED, Fedrigo  O, Koepfli  K-P, Urban  L, Gemmell  NJ, Li  H. Haplotype-resolved assembly of diploid genomes without parental data. Nat Biotechnol. 2022:40(9):1332–1335. 10.1038/s41587-022-01261-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Christmann  M, Kaina  B. Transcriptional regulation of human DNA repair genes following genotoxic stress: trigger mechanisms, inducible responses and genotoxic adaptation. Nucleic Acids Res. 2013:41(18):8403–8420. 10.1093/nar/gkt635 [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Costas  AM, White  AK, Metcalf  WW. Purification and characterization of a novel phosphorus-oxidizing enzyme from Pseudomonas stutzeri WM88. J Biol Chem. 2001:276(20):17429–17436. 10.1074/jbc.M011764200 [DOI] [PubMed] [Google Scholar]
  43. Craig  RJ, Gallaher  SD, Shu  S, Salome  PA, Jenkins  JW, Blaby-Haas  CE, Purvine  SO, O’Donnell  S, Barry  K, Grimwood  J, et al.  The Chlamydomonas genome project, version 6: reference assemblies for mating-type plus and minus strains reveal extensive structural mutation in the laboratory. Plant Cell. 2023:35(2):644–672. 10.1093/plcell/koac347 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Craig  RJ, Hasan  AR, Ness  RW, Keightley  PD. Comparative genomics of Chlamydomonas. Plant Cell. 2021:33(4):1016–1041. 10.1093/plcell/koab026 [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Cross  FR. Tying down loose ends in the Chlamydomonas genome: functional significance of abundant upstream open reading frames. G3 (Bethesda). 2016:6(2):435–446. 10.1534/g3.115.023119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Crozet  P, Navarro  FJ, Willmund  F, Mehrshahi  P, Bakowski  K, Lauersen  KJ, Perez-Perez  M-E, Auroy  P, Gorchs Rovira  A, Sauret-Gueto  S, et al.  Birth of a photosynthetic chassis: a MoClo toolkit enabling synthetic biology in the microalga Chlamydomonas reinhardtii. ACS Synth Biol. 2018:7(9):2074–2086. 10.1021/acssynbio.8b00251 [DOI] [PubMed] [Google Scholar]
  47. Dale  AL, Feau  N, Everhart  SE, Dhillon  B, Wong  B, Sheppard  J, Bilodeau  GJ, Brar  A, Tabima  JF, Shen  D, et al.  Mitotic recombination and rapid genome evolution in the invasive forest pathogen Phytophthora ramorum. mBio. 2019:10(2):e02452–e02418. 10.1128/mBio.02452-18 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Danecek  P, Bonfield  JK, Liddle  J, Marshall  J, Ohan  V, Pollard  MO, Whitwham  A, Keane  T, McCarthy  SA, Davies  RM, et al.  Twelve years of SAMtools and BCFtools. Gigascience. 2021:10(2):giab008. 10.1093/gigascience/giab008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Darienko  T, Pröschold  T. Genetic variability and taxonomic revision of the genus Auxenochlorella (Shihira et Krauss) Kalina et Puncocharova (Trebouxiophyceae, Chlorophyta). J Phycol. 2015:51(2):394–400. 10.1111/jpy.12279 [DOI] [PubMed] [Google Scholar]
  50. da Roza  PA, Muller  H, Sullivan  GJ, Walker  RSK, Goold  HD, Willows  RD, Palenik  B, Paulsen  IT. Chromosome-scale assembly of the streamlined picoeukaryote Picochlorum sp. SENEW3 genome reveals Rabl-like chromatin structure and potential for C(4) photosynthesis. Microb Genom. 2024:10(4):001223. 10.1099/mgen.0.001223 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Del Cortona  A, Jackson  CJ, Bucchini  F, Van Bel  M, D’Hondt  S, Skaloud  P, Delwiche  CF, Knoll  AH, Raven  JA, Verbruggen  H, et al.  Neoproterozoic origin and multiple transitions to macroscopic growth in green seaweeds. Proc Natl Acad Sci U S A. 2020:117(5):2551–2559. 10.1073/pnas.1910060117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Diner  RE, Nodding  CM, Lian  NC, Kang  AK, McQuaid  JB, Jablanovic  J, Espinoza  JL, Nguyen  NA, Anzelmatti  MA, Jansson  J, et al.  Diatom centromeres suggest a mechanism for nuclear DNA acquisition. Proc Natl Acad Sci U S A. 2017:114(29):E6015–E6024. 10.1073/pnas.1700764114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Dobin  A, Davis  CA, Schlesinger  F, Drenkow  J, Zaleski  C, Jha  S, Batut  P, Chaisson  M, Gingeras  TR. STAR: ultrafast universal RNA-Seq aligner. Bioinformatics. 2013:29(1):15–21. 10.1093/bioinformatics/bts635 [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Drillon  G, Carbone  A, Fischer  G. SynChro: a fast and easy tool to reconstruct and visualize synteny blocks along eukaryotic chromosomes. PLoS One. 2014:9(3):e92621. 10.1371/journal.pone.0092621 [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Dueñas  MA, Craig  RJ, Gallaher  SD, Moseley  JL, Merchant  SS. Leaky ribosomal scanning enables tunable translation of bicistronic ORFs in green algae. Proc Natl Acad Sci U S A. 2025a:122(9):e2417695122. 10.1073/pnas.2417695122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Dueñas  MA, Lin  Y-T, Moseley  JL, Merchant  SS. Transformation protocol for targeted homologous recombination in Auxenochlorella protothecoides. doi.org/10.17504/protocols.io.x54v922mql3e/v1. Protocols.io, 2025b.
  57. Dutta  A, Dutreux  F, Schacherer  J. Loss of heterozygosity results in rapid but variable genome homogenization across yeast genetic backgrounds. Elife. 2021:10:e70339. 10.7554/eLife.70339 [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Dutta  A, Schacherer  J. The dynamics of loss of heterozygosity events in genomes. EMBO Rep. 2025:26(3):602–612. 10.1038/s44319-024-00353-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Emms  DM, Kelly  S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019:20(1):238. 10.1186/s13059-019-1832-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Fedry  J, Liu  Y, Pehau-Arnaudet  G, Pei  J, Li  W, Tortorici  MA, Traincard  F, Meola  A, Bricogne  G, Grishin  NV, et al.  The ancient gamete fusogen HAP2 is a eukaryotic class II fusion protein. Cell. 2017:168(5):904–915.e10. 10.1016/j.cell.2017.01.024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Feng  L, Mundy  JEA, Stevenson  CEM, Mitchenall  LA, Lawson  DM, Mi  K, Maxwell  A. The pentapeptide-repeat protein, MfpA, interacts with mycobacterial DNA gyrase as a DNA T-segment mimic. Proc Natl Acad Sci U S A. 2021:118(11):e2016705118. 10.1073/pnas.2016705118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Ferenczi  A, Pyott  DE, Xipnitou  A, Molnar  A. Efficient targeted DNA editing and replacement in Chlamydomonas reinhardtii using Cpf1 ribonucleoproteins and single-stranded DNA. Proc Natl Acad Sci U S A. 2017:114(51):13567–13572. 10.1073/pnas.1710597114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Figueroa-Martinez  F, Nedelcu  AM, Smith  DR, Reyes-Prieto  A. When the lights go out: the evolutionary fate of free-living colorless green algae. New Phytol.  2015:206(3):972–982. 10.1111/nph.13279 [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Fischer  G, Liti  G, Llorente  B. The budding yeast life cycle: more complex than anticipated?  Yeast. 2021:38(1):5–11. 10.1002/yea.3533 [DOI] [PubMed] [Google Scholar]
  65. Flynn  JM, Hubley  R, Goubert  C, Rosen  J, Clark  AG, Feschotte  C, Smit  AF. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A. 2020:117(17):9451–9457. 10.1073/pnas.1921046117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Foflonker  F, Mollegard  D, Ong  M, Yoon  HS, Bhattacharya  D. Genomic analysis of Picochlorum species reveals how microalgae may adapt to variable environments. Mol Biol Evol. 2018:35(11):2702–2711. 10.1093/molbev/msy167 [DOI] [PubMed] [Google Scholar]
  67. Franklin  S, Somanchi  A, Espina  K, Rudenko  G, Chua  P. Renewable chemical production from novel fatty acid feedstocks. United States US7883882B2, filed November 30, 2009, and issued February 9, 2011.
  68. Franklin  S, Somanchi  A, Wee  J, Rudenko  G, Moseley  J, Rakitsky  W. Tailored oils produced from recombinant heterotrophic microorganisms. United States US8592188B2, filed May 27, 2011, and issued November 26, 2013.
  69. Fu  Y, Luo  G-Z, Chen  K, Deng  X, Yu  M, Han  D, Hao  Z, Liu  J, Lu  X, Dore  LC, et al.  N6-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas. Cell. 2015:161(4):879–892. 10.1016/j.cell.2015.04.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Fučiková  K, Pažoutová  M, Rindi  F. Meiotic genes and sexual reproduction in the green algal class Trebouxiophyceae (Chlorophyta). J Phycol. 2015:51(3):419–430. 10.1111/jpy.12293 [DOI] [PubMed] [Google Scholar]
  71. Fujiwara  T, Ohnuma  M, Yoshida  M, Kuroiwa  T, Hirano  T. Gene targeting in the red alga Cyanidioschyzon merolae: single- and multi-copy insertion using authentic and chimeric selection markers. PLoS One. 2013:8(9):e73608. 10.1371/journal.pone.0073608 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Fulnečková  J, Hasíková  T, Fajkus  J, Lukešová  A, Eliáš  M, Sýkorová  E. Dynamic evolution of telomeric sequences in the green algal order Chlamydomonadales. Genome Biol Evol. 2012:4(3):248–264. 10.1093/gbe/evs007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Gabaldón  T. Hybridization and the origin of new yeast lineages. FEMS Yeast Res. 2020:20(5):foaa040. 10.1093/femsyr/foaa040 [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Gallaher  SD, Craig  RJ, Ganesan  I, Purvine  SO, McCorkle  SR, Grimwood  J, Strenkert  D, Davidi  L, Roth  MS, Jeffers  TL, et al.  Widespread polycistronic gene expression in green algae. Proc Natl Acad Sci U S A. 2021:118(7):e2017714118. 10.1073/pnas.2017714118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Gallone  B, Steensels  J, Mertens  S, Dzialo  MC, Gordon  JL, Wauters  R, Thesseling  FA, Bellinazzo  F, Saels  V, Herrera-Malaver  B, et al.  Interspecific hybridization facilitates niche adaptation in beer yeast. Nat Ecol Evol. 2019:3(11):1562–1575. 10.1038/s41559-019-0997-9 [DOI] [PubMed] [Google Scholar]
  76. Gao  C, Wang  Y, Shen  Y, Yan  D, He  X, Dai  J, Wu  Q. Oil accumulation mechanisms of the oleaginous microalga Chlorella protothecoides revealed through its genome, transcriptomes, and proteomes. BMC Genomics. 2014:15(1):582. 10.1186/1471-2164-15-582 [DOI] [PMC free article] [PubMed] [Google Scholar]
  77. Garagna  S, Page  J, Fernandez-Donoso  R, Zuccotti  M, Searle  JB. The Robertsonian phenomenon in the house mouse: mutation, meiosis and speciation. Chromosoma. 2014:123(6):529–544. 10.1007/s00412-014-0477-6 [DOI] [PubMed] [Google Scholar]
  78. Gazquez  A, Bordenave  CD, Montero-Pau  J, Pérez-Rodrigo  M, Marco  F, Martínez-Alberola  F, Muggia  L, Barreno  E, Carrasco  P. From spores to gametes: a sexual life cycle in a symbiotic Trebouxia microalga. Algal Res. 2024:84:103744. 10.1016/j.algal.2024.103744 [DOI] [Google Scholar]
  79. Goold  HD, Moseley  JL, Lauersen  KJ. The synthetic future of algal genomes. Cell Genom. 2024:4(3):100505. 10.1016/j.xgen.2024.100505 [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Goubert  C, Craig  RJ, Bilat  AF, Peona  V, Vogan  AA, Protasio  AV. A beginner's guide to manual curation of transposable elements. Mob DNA. 2022:13(1):7. 10.1186/s13100-021-00259-7 [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Guo  J, Jian  J, Wang  L, Xiong  L, Lin  H, Zhou  Z, Sonnenschein  EC, Wu  W. Genome sequences of two strains of Prototheca wickerhamii provide insight into the protothecosis evolution. Front Cell Infect Microbiol. 2022:12:797017. 10.3389/fcimb.2022.797017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  82. Harris  EH. Chlamydomonas as a model organism. Annu Rev Plant Physiol Plant Mol Biol. 2001:52(1):363–406. 10.1146/annurev.arplant.52.1.363 [DOI] [PubMed] [Google Scholar]
  83. He  X, Dai  J, Wu  Q. Identification of sporopollenin as the outer layer of cell wall in microalga Chlorella protothecoides. Front Microbiol. 2016:7:1047. 10.3389/fmicb.2016.01047 [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Heimerl  N, Hommel  E, Westermann  M, Meichsner  D, Lohr  M, Hertweck  C, Grossman  AR, Mittag  M, Sasso  S. A giant type I polyketide synthase participates in zygospore maturation in Chlamydomonas reinhardtii. Plant J. 2018:95(2):268–281. 10.1111/tpj.13948 [DOI] [PubMed] [Google Scholar]
  85. Hirakawa  MP, Martinez  DA, Sakthikumar  S, Anderson  MZ, Berlin  A, Gujja  S, Zeng  Q, Zisson  E, Wang  JM, Greenberg  JM, et al.  Genetic and phenotypic intra-species variation in Candida albicans. Genome Res. 2015:25(3):413–425. 10.1101/gr.174623.114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Hirooka  S, Itabashi  T, Ichinose  TM, Onuma  R, Fujiwara  T, Yamashita  S, Jong  LW, Tomita  R, Iwane  AH, Miyagishima  S-y. Life cycle and functional genomics of the unicellular red alga Galdieria for elucidating algal and plant evolution and industrial use. Proc Natl Acad Sci U S A. 2022:119(41):e2210665119. 10.1073/pnas.2210665119 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Hisanaga  T, Fujimoto  S, Cui  Y, Sato  K, Sano  R, Yamaoka  S, Kohchi  T, Berger  F, Nakajima  K. Deep evolutionary origin of gamete-directed zygote activation by KNOX/BELL transcription factors in green plants. Elife. 2021:10:e57090. 10.7554/eLife.57090 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Hobson  DJ, Wei  W, Steinmetz  LM, Svejstrup  JQ. RNA polymerase II collision interrupts convergent transcription. Mol Cell. 2012:48(3):365–374. 10.1016/j.molcel.2012.08.027 [DOI] [PMC free article] [PubMed] [Google Scholar]
  89. Hoff  KJ, Lange  S, Lomsadze  A, Borodovsky  M, Stanke  M. BRAKER1: unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. Bioinformatics. 2016:32(5):767–769. 10.1093/bioinformatics/btv661 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Hoff  KJ, Stanke  M. Predicting genes in single genomes with AUGUSTUS. Curr Protoc Bioinformatics. 2019:65(1):e57. 10.1002/cpbi.57 [DOI] [PubMed] [Google Scholar]
  91. Hollingshead  S, Kopecna  J, Jackson  PJ, Canniffe  DP, Davison  PA, Dickman  MJ, Sobotka  R, Hunter  CN. Conserved chloroplast open-reading frame ycf54 is required for activity of the magnesium protoporphyrin monomethylester oxidative cyclase in Synechocystis PCC 6803. J Biol Chem. 2012:287(33):27823–27833. 10.1074/jbc.M112.352526 [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Huff  JT, Zilberman  D. Dnmt1-independent CG methylation contributes to nucleosome positioning in diverse eukaryotes. Cell. 2014:156(6):1286–1297. 10.1016/j.cell.2014.01.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Huss  VAR, Frank  C, Hartmann  EC, Hirmer  M, Kloboucek  A, Seidel  BM, Wenzeler  P, Kessler  E. Biochemical taxonomy and molecular phylogeny of the genus Chlorella sensu lato (Chlorophyta). J Phycol. 1999:35(3):587–598. 10.1046/j.1529-8817.1999.3530587.x [DOI] [Google Scholar]
  94. Huss  VAR, Holweg  C, Seidel  B, Reich  V, Rahat  M, Kessler  E. There is an ecological basis for host symbiont specificity in Chlorella/Hydra symbioses. Endocyt Cell Res. 1994:10:35–46. [Google Scholar]
  95. Jacobebbinghaus  N, Bigge  F, Saudhof  M, Hubner  W, Kruse  O, Baier  T. Transcriptional gene fusions via targeted integration at safe harbors for high transgene expression in Chlamydomonas reinhardtii. New Phytol. 2025:247:2665–2677. 10.1111/nph.70368 [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Jagielski  T, Bakuła  Z, Gawor  J, Maciszewski  K, Kusber  W-H, Dylag  M, Nowakowska  J, Gromadka  R, Karnkowska  A. The genus Prototheca (Trebouxiophyceae, Chlorophyta) revisited: implications from molecular taxonomic studies. Algal Res. 2019:43:101639. 10.1016/j.algal.2019.101639 [DOI] [Google Scholar]
  97. Joo  S, Nishimura  Y, Cronmiller  E, Hong  RH, Kariyawasam  T, Wang  MH, Shao  NC, El Akkad  S-ED, Suzuki  T, Higashiyama  T, et al.  Gene regulatory networks for the haploid-to-diploid transition of Chlamydomonas reinhardtii. Plant Physiol. 2017:175(1):314–332. 10.1104/pp.17.00731 [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Joo  S, Wang  MH, Lui  G, Lee  J, Barnas  A, Kim  E, Sudek  S, Worden  AZ, Lee  J-H. Common ancestry of heterodimerizing TALE homeobox transcription factors across Metazoa and Archaeplastida. BMC Biol. 2018:16(1):136. 10.1186/s12915-018-0605-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Kalina  T, Punčochárová  M. Taxonomy of the subfamily Scotiellocystoideae Fott 1976 (Chlorellaceae, Chlorophyceae). Algol Stud. 1987:45:473–521. [Google Scholar]
  100. Kanesaki  Y, Imamura  S, Matsuzaki  M, Tanaka  K. Identification of centromere regions in chromosomes of a unicellular red alga, Cyanidioschyzon merolae. FEBS Lett. 2015:589(11):1219–1224. 10.1016/j.febslet.2015.04.009 [DOI] [PubMed] [Google Scholar]
  101. Karpowicz  SJ, Prochnik  SE, Grossman  AR, Merchant  SS. The GreenCut2 resource, a phylogenomically derived inventory of proteins specific to the plant lineage. J Biol Chem. 2011:286(24):21427–21439. 10.1074/jbc.M111.233734 [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Katoh  K, Standley  DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013:30(4):772–780. 10.1093/molbev/mst010 [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Khan  H, Ochi  T. Plant PAXX has an XLF-like function and stimulates DNA end joining by the Ku-DNA ligase IV/XRCC4 complex. Plant J. 2023:116(1):58–68. 10.1111/tpj.16359 [DOI] [PubMed] [Google Scholar]
  104. Kieselbach  T, Mant  A, Robinson  C, Schroder  WP. Characterisation of an Arabidopsis cDNA encoding a thylakoid lumen protein related to a novel ‘pentapeptide repeat’ family of proteins. FEBS Lett. 1998:428(3):241–244. 10.1016/S0014-5793(98)00517-1 [DOI] [PubMed] [Google Scholar]
  105. Kim  SS, Grienenberger  E, Lallemand  B, Colpitts  CC, Kim  SY, Souza Cde  A, Geoffroy  P, Heintz  D, Krahn  D, Kaiser  M, et al.  LAP6/POLYKETIDE SYNTHASE A and LAP5/POLYKETIDE SYNTHASE B encode hydroxyalkyl alpha-pyrone synthases required for pollen development and sporopollenin biosynthesis in Arabidopsis thaliana. Plant Cell. 2011:22(12):4045–4066. 10.1105/tpc.110.080028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Kindle  KL. High-frequency nuclear transformation of Chlamydomonas reinhardtii. Proc Natl Acad Sci U S A. 1990:87(3):1228–1232. 10.1073/pnas.87.3.1228 [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Kindle  KL, Schnell  RA, Fernandez  E, Lefebvre  PA. Stable nuclear transformation of Chlamydomonas using the Chlamydomonas gene for nitrate reductase. J Cell Biol. 1989:109(6):2589–2601. 10.1083/jcb.109.6.2589 [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Kipreos  ET, Pagano  M. The F-box protein family. Genome Biol. 2000:1(5):reviews3002.1. 10.1186/gb-2000-1-5-reviews3002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Kolesiński  P, Golik  P, Grudnik  P, Piechota  J, Markiewicz  M, Tarnawski  M, Dubin  G, Szczepaniak  A. Insights into eukaryotic Rubisco assembly—crystal structures of RbcX chaperones from Arabidopsis thaliana. Biochim Biophys Acta. 2013:1830(4):2899–2906. 10.1016/j.bbagen.2012.12.025 [DOI] [PubMed] [Google Scholar]
  110. Kolesiński  P, Piechota  J, Szczepaniak  A. Initial characteristics of RbcX proteins from Arabidopsis thaliana. Plant Mol Biol. 2011:77(4–5):447–459. 10.1007/s11103-011-9823-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Krasovec  M, Merret  R, Sanchez  F, Sanchez-Brosseau  S, Piganeau  G. A high frequency of chromosomal duplications in unicellular algae is compensated by translational regulation. Genome Biol Evol. 2023:15(6):evad086. 10.1093/gbe/evad086 [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Krishnan  A, Dahlin  LR, Guarnieri  MT, Weissman  JC, Posewitz  MC. Small cells with big photosynthetic productivities: biotechnological potential of the Picochlorum genus. Trends Biotechnol. 2025:43(4):759–772. 10.1016/j.tibtech.2024.10.004 [DOI] [PubMed] [Google Scholar]
  113. Kropat  J, Hong-Hermesdorf  A, Casero  D, Ent  P, Castruita  M, Pellegrini  M, Merchant  SS, Malasarn  D. A revised mineral nutrient supplement increases biomass and growth rate in Chlamydomonas reinhardtii. Plant J. 2011:66(5):770–780. 10.1111/j.1365-313X.2011.04537.x [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Krüger  W. Beiträge zur kenntnis der organismen des saftflusses (sog. Schleimflusses) der laubbäume. II. Über zwei aus saftflüssen rein gezüchtete algen. Zopf Beitr Physiol Morphol Organ. 1894a:4:91–116. [Google Scholar]
  115. Krüger  W. Kurze charakteristik einiger niederer organismen im saftflusse der laubbäume. Hedwigia. 1894b:33:241–266. [Google Scholar]
  116. Kwon  T, Hanschen  ER, Hovde  BT. Addressing the pervasive scarcity of structural annotation in eukaryotic algae. Sci Rep. 2023:13(1):1687. 10.1038/s41598-023-27881-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Lancaster  SM, Payen  C, Smukowski Heil  C, Dunham  MJ. Fitness benefits of loss of heterozygosity in Saccharomyces hybrids. Genome Res. 2019:29(10):1685–1692. 10.1101/gr.245605.118 [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Langdon  QK, Peris  D, Baker  EP, Opulente  DA, Nguyen  H-V, Bond  U, Goncalves  P, Sampaio  JP, Libkind  D, Hittinger  CT. Fermentation innovation through complex hybridization of wild and domesticated yeasts. Nat Ecol Evol. 2019:3(11):1576–1586. 10.1038/s41559-019-0998-8 [DOI] [PMC free article] [PubMed] [Google Scholar]
  119. Lee  J-H, Lin  H, Joo  S, Goodenough  U. Early sexual origins of homeoprotein heterodimerization and evolution of the plant KNOX/BELL family. Cell. 2008:133(5):829–840. 10.1016/j.cell.2008.04.028 [DOI] [PubMed] [Google Scholar]
  120. Lemieux  C, Turmel  M, Otis  C, Pombert  J-F. A streamlined and predominantly diploid genome in the tiny marine green alga Chloropicon primus. Nat Commun. 2019:10(1):4061. 10.1038/s41467-019-12014-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Li  H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 3997. 10.48550/arXiv.1303.3997, 16 March 2013, preprint: not peer reviewed. [DOI]
  122. Li  H. New strategies to improve minimap2 alignment accuracy. Bioinformatics. 2021:37(23):4572–4574. 10.1093/bioinformatics/btab705 [DOI] [PMC free article] [PubMed] [Google Scholar]
  123. Li  X, Patena  W, Fauser  F, Jinkerson  RE, Saroussi  S, Meyer  MT, Ivanova  N, Robertson  JM, Yue  R, Zhang  R, et al.  A genome-wide algal mutant library and functional screen identifies genes required for eukaryotic photosynthesis. Nat Genet. 2019:51(4):627–635. 10.1038/s41588-019-0370-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  124. Li  X, Zhang  R, Patena  W, Gang  SS, Blum  SR, Ivanova  N, Yue  R, Robertson  JM, Lefebvre  PA, Fitz-Gibbon  ST, et al.  An indexed, mapped mutant library enables reverse genetics studies of biological processes in Chlamydomonas reinhardtii. Plant Cell. 2016:28(2):367–387. 10.1105/tpc.15.00465 [DOI] [PMC free article] [PubMed] [Google Scholar]
  125. Lin  J, Le  TV, Augspurger  K, Tritschler  D, Bower  R, Fu  G, Perrone  C, O’Toole  ET, Mills  KV, Dymek  E, et al.  FAP57/WDR65 targets assembly of a subset of inner arm dyneins and connects to regulatory hubs in cilia. Mol Biol Cell. 2019:30(21):2659–2680. 10.1091/mbc.E19-07-0367 [DOI] [PMC free article] [PubMed] [Google Scholar]
  126. Lin  W-R, Ng  I-S. Development of CRISPR/Cas9 system in Chlorella vulgaris FSP-E to enhance lipid accumulation. Enzyme Microb Technol. 2020:133:109458. 10.1016/j.enzmictec.2019.109458 [DOI] [PubMed] [Google Scholar]
  127. Liu  C, Young  AL, Starling-Windhof  A, Bracher  A, Saschenbrecker  S, Rao  BV, Rao  KV, Berninghausen  O, Mielke  T, Hartl  FU, et al.  Coupled chaperone action in folding and assembly of hexadecameric rubisco. Nature. 2010:463(7278):197–202. 10.1038/nature08651 [DOI] [PubMed] [Google Scholar]
  128. Liu  HW, Khera  R, Grob  P, Gallaher  SD, Purvine  SO, Nicora  CD, Lipton  MS, Niyogi  KK, Nogales  E, Iwai  M, et al.  A distinct LHCI arrangement is recruited to photosystem I in Fe-starved green algae. Proc Natl Acad Sci U S A. 2025:122(25):e2500621122. 10.1073/pnas.2500621122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Loera-Quezada  MM, Leyva-Gonzalez  MA, Velazquez-Juarez  G, Sanchez-Calderon  L, Do Nascimento  M, Lopez-Arredondo  D, Herrera-Estrella  L. A novel genetic engineering platform for the effective management of biological contaminants for the production of microalgae. Plant Biotechnol J. 2016:14(10):2066–2076. 10.1111/pbi.12564 [DOI] [PMC free article] [PubMed] [Google Scholar]
  130. Lopez  D, Hamaji  T, Kropat  J, De Hoff  P, Morselli  M, Rubbi  L, Fitz-Gibbon  S, Gallaher  SD, Merchant  SS, Umen  J, et al.  Dynamic changes in the transcriptome and methylome of Chlamydomonas reinhardtii throughout its life cycle. Plant Physiol. 2015:169(4):2730–2743. 10.1104/pp.15.00861 [DOI] [PMC free article] [PubMed] [Google Scholar]
  131. Lopez-Cortegano  E, Craig  RJ, Chebib  J, Balogun  EJ, Keightley  PD. Rates and spectra of de novo structural mutations in Chlamydomonas reinhardtii. Genome Res. 2023:33(1):45–60. 10.1101/gr.276957.122 [DOI] [PMC free article] [PubMed] [Google Scholar]
  132. Lozano  JC, Schatt  P, Botebol  H, Verge  V, Lesuisse  E, Blain  S, Carre  IA, Bouget  FY. Efficient gene targeting and removal of foreign DNA by homologous recombination in the picoeukaryote Ostreococcus. Plant J. 2014:78(6):1073–1083. 10.1111/tpj.12530 [DOI] [PubMed] [Google Scholar]
  133. MacIntosh  GC. RNase T2 family: enzymatic properties, functional diversity, and evolution of ancient ribonucleases. In: Nicholson  A, editor. Ribonucleases. Nucleic acids and molecular biology. Berlin: Springer; 2011. p. 89–114. [Google Scholar]
  134. Mancera  E, Bourgon  R, Brozzi  A, Huber  W, Steinmetz  LM. High-resolution mapping of meiotic crossovers and non-crossovers in yeast. Nature. 2008:454(7203):479–485. 10.1038/nature07135 [DOI] [PMC free article] [PubMed] [Google Scholar]
  135. Manni  M, Berkeley  MR, Seppey  M, Simao  FA, Zdobnov  EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021:38(10):4647–4654. 10.1093/molbev/msab199 [DOI] [PMC free article] [PubMed] [Google Scholar]
  136. Marcet-Houben  M, Gabaldón  T. Beyond the whole-genome duplication: phylogenetic evidence for an ancient interspecies hybridization in the baker's yeast lineage. PLoS Biol. 2015:13(8):e1002220. 10.1371/journal.pbio.1002220 [DOI] [PMC free article] [PubMed] [Google Scholar]
  137. Marcolungo  L, Bellamoli  F, Cecchin  M, Lopatriello  G, Rossato  M, Cosentino  E, Rombauts  S, Delledonne  M, Ballottari  M. Haematococcus lacustris genome assembly and annotation reveal diploid genetic traits and stress-induced gene expression patterns. Algal Res. 2024:80:103567. 10.1016/j.algal.2024.103567 [DOI] [PMC free article] [PubMed] [Google Scholar]
  138. Maruyama  S, Sugahara  J, Kanai  A, Nozaki  H. Permuted tRNA genes in the nuclear and nucleomorph genomes of photosynthetic eukaryotes. Mol Biol Evol. 2010:27(5):1070–1076. 10.1093/molbev/msp313 [DOI] [PubMed] [Google Scholar]
  139. Matsuka  M, Otsuka  H, Hase  E. Changes in contents of carbohydrate and fatty acid in cells of Chlorella protothecoides during processes of de- and re-generation of chloroplasts. Plant Cell Physiol. 1966:7(4):651–662. 10.1093/oxfordjournals.pcp.a079217 [DOI] [Google Scholar]
  140. Mattei  AL, Bailly  N, Meissner  A. DNA methylation: a historical perspective. Trends Genet. 2022:38(7):676–707. 10.1016/j.tig.2022.03.010 [DOI] [PubMed] [Google Scholar]
  141. Merchant  SS, Prochnik  SE, Vallon  O, Harris  EH, Karpowicz  SJ, Witman  GB, Terry  A, Salamov  A, Fritz-Laylin  LK, Marechal-Drouard  L, et al.  The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science. 2007:318(5848):245–250. 10.1126/science.1143609 [DOI] [PMC free article] [PubMed] [Google Scholar]
  142. Minh  BQ, Schmidt  HA, Chernomor  O, Schrempf  D, Woodhams  MD, von Haeseler  A, Lanfear  R. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020:37(5):1530–1534. 10.1093/molbev/msaa015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  143. Minoda  A, Sakagami  R, Yagisawa  F, Kuroiwa  T, Tanaka  K. Improvement of culture conditions and evidence for nuclear transformation by homologous recombination in a red alga, Cyanidioschyzon merolae 10D. Plant Cell Physiol. 2004:45(6):667–671. 10.1093/pcp/pch087 [DOI] [PubMed] [Google Scholar]
  144. Mock  H-P. Photosensitizing tetrapyrroles induce antioxidative and pathogen defense responses in plants. In: Inze  D, Van Montagu  M, editors. Oxidative stress in plants. London: CRC Press; 2001. p. 207–226. [Google Scholar]
  145. Moore  ER, Bullington  BS, Weisberg  AJ, Jiang  Y, Chang  J, Halsey  KH. Morphological and transcriptomic evidence for ammonium induction of sexual reproduction in Thalassiosira pseudonana and other centric diatoms. PLoS One. 2017:12(7):e0181098. 10.1371/journal.pone.0181098 [DOI] [PMC free article] [PubMed] [Google Scholar]
  146. Moseley  JL, Lee  B, Im  CS, Kim  J, Kim  D, Bhat  R. Production of lipids and terpenoids in Auxenochlorella protothecoides. United States US12037630B2, filed November 5, 2021, and issued July 16, 2024.
  147. Neupert  J, Gallaher  SD, Lu  Y, Strenkert  D, Segal  N, Barahimipour  R, Fitz-Gibbon  ST, Schroda  M, Merchant  SS, Bock  R. An epigenetic gene silencing pathway selectively acting on transgenic DNA in the green alga Chlamydomonas. Nat Commun. 2020:11(1):6269. 10.1038/s41467-020-19983-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  148. Nievergelt  AP. Genome editing in the green alga Chlamydomonas: past, present practice and future prospects. Plant J. 2025:122(1):e70140. 10.1111/tpj.70140 [DOI] [PMC free article] [PubMed] [Google Scholar]
  149. Nievergelt  AP, Diener  DR, Bogdanova  A, Brown  T, Pigino  G. Efficient precision editing of endogenous Chlamydomonas reinhardtii genes with CRISPR-Cas. Cell Rep Methods. 2023:3(8):100562. 10.1016/j.crmeth.2023.100562 [DOI] [PMC free article] [PubMed] [Google Scholar]
  150. Ning  J, Otto  TD, Pfander  C, Schwach  F, Brochet  M, Bushell  E, Goulding  D, Sanders  M, Lefebvre  PA, Pei  J, et al.  Comparative genomics in Chlamydomonas and Plasmodium identifies an ancient nuclear envelope protein family essential for sexual reproduction in protists, fungi, plants, and vertebrates. Genes Dev. 2013:27(10):1198–1215. 10.1101/gad.212746.112 [DOI] [PMC free article] [PubMed] [Google Scholar]
  151. Orr-Weaver  TL, Szostak  JW, Rothstein  RJ. Yeast transformation: a model system for the study of recombination. Proc Natl Acad Sci U S A. 1981:78(10):6354–6358. 10.1073/pnas.78.10.6354 [DOI] [PMC free article] [PubMed] [Google Scholar]
  152. Pardo-Palacios  FJ, Arzalluz-Luque  A, Kondratova  L, Salguero  P, Mestre-Tomas  J, Amorin  R, Estevan-Morio  E, Liu  T, Nanni  A, McIntyre  L, et al.  SQANTI3: curation of long-read transcriptomes for accurate identification of known and novel isoforms. Nat Methods. 2024:21(5):793–797. 10.1038/s41592-024-02229-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  153. Park  S-H, Kyndt  JA, Brown  JK. Comparison of Auxenochlorella protothecoides and Chlorella spp. chloroplast genomes: evidence for endosymbiosis and horizontal virus-like gene transfer. Life (Basel). 2022:12(3):458. 10.3390/life12030458 [DOI] [PMC free article] [PubMed] [Google Scholar]
  154. Peris  D, Perez-Torrado  R, Hittinger  CT, Barrio  E, Querol  A. On the origins and industrial applications of Saccharomyces cerevisiae × Saccharomyces kudriavzevii hybrids. Yeast. 2018:35(1):51–69. 10.1002/yea.3283 [DOI] [PubMed] [Google Scholar]
  155. Petroll  R, Papareddy  RK, Krela  R, Laigle  A, Riviere  Q, Bišová  K, Mozgová  I, Borg  M. The expansion and diversification of epigenetic regulatory networks underpins major transitions in the evolution of land plants. Mol Biol Evol. 2025:42(4):msaf064. 10.1093/molbev/msaf064 [DOI] [PMC free article] [PubMed] [Google Scholar]
  156. Pombert  J-F, Blouin  NA, Lane  C, Boucias  D, Keeling  PJ. A lack of parasitic reduction in the obligate parasitic green alga Helicosporidium. PLoS Genet. 2014:10(5):e1004355. 10.1371/journal.pgen.1004355 [DOI] [PMC free article] [PubMed] [Google Scholar]
  157. Pontes  A, Cadez  N, Goncalves  P, Sampaio  JP. A quasi-domesticate relic hybrid population of Saccharomyces cerevisiae × S. paradoxus adapted to olive brine. Front Genet. 2019:10:449. 10.3389/fgene.2019.00449 [DOI] [PMC free article] [PubMed] [Google Scholar]
  158. Poplin  R, Chang  P-C, Alexander  D, Schwartz  S, Colthurst  T, Ku  A, Newburger  D, Dijamco  J, Nguyen  N, Afshar  PT, et al.  A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol. 2018:36(10):983–987. 10.1038/nbt.4235 [DOI] [PubMed] [Google Scholar]
  159. Pore  RS. Nutritional basis for relating Prototheca and Chlorella. Can J Microbiol. 1972:18(7):1175–1177. 10.1139/m72-183 [DOI] [PubMed] [Google Scholar]
  160. Prescott  EM, Proudfoot  NJ. Transcriptional collision between convergent genes in budding yeast. Proc Natl Acad Sci U S A. 2002:99(13):8796–8801. 10.1073/pnas.132270899 [DOI] [PMC free article] [PubMed] [Google Scholar]
  161. Pryszcz  LP, Nemeth  T, Saus  E, Ksiezopolska  E, Hegedusova  E, Nosek  J, Wolfe  KH, Gacser  A, Gabaldon  T. The genomic aftermath of hybridization in the opportunistic pathogen Candida metapsilosis. PLoS Genet. 2015:11(10):e1005626. 10.1371/journal.pgen.1005626 [DOI] [PMC free article] [PubMed] [Google Scholar]
  162. Quinlan  AR, Hall  IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010:26(6):841–842. 10.1093/bioinformatics/btq033 [DOI] [PMC free article] [PubMed] [Google Scholar]
  163. Randolph-Anderson  BL, Boynton  JE, Gillham  NW, Harris  EH, Johnson  AM, Dorthu  M-P, Matagne  RF. Further characterization of the respiratory deficient dum-1 mutation of Chlamydomonas reinhardtii and its use as a recipient for mitochondrial transformation. Mol Gen Genet. 1993:236(2–3):235–244. 10.1007/BF00277118 [DOI] [PubMed] [Google Scholar]
  164. Robinson  JT, Thorvaldsdóttir  H, Winckler  W, Guttman  M, Lander  ES, Getz  G, Mesirov  JP. Integrative genomics viewer. Nat Biotechnol. 2011:29(1):24–26. 10.1038/nbt.1754 [DOI] [PMC free article] [PubMed] [Google Scholar]
  165. Romero Charria  P, Navarrete  C, Ovchinnikov  V, Sarre  LA, Shabardina  V, Casacuberta  E, Lara-Astiaso  D, Sebé-Pedrós  A, de Mendoza  A. Adenine DNA methylation associated to transcription is widespread across eukaryotes. bioRxiv 620566. 10.1101/2024.10.28.620566, 28 October 2024, preprint: not peer reviewed. [DOI] [PMC free article] [PubMed]
  166. Saito  M, Xu  P, Faure  G, Maguire  S, Kannan  S, Altae-Tran  H, Vo  S, Desimone  A, Macrae  RK, Zhang  F. Fanzor is a eukaryotic programmable RNA-guided endonuclease. Nature. 2023:620(7974):660–668. 10.1038/s41586-023-06356-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  167. Salome  PA, Merchant  SS. A series of fortunate events: introducing Chlamydomonas as a reference organism. Plant Cell. 2019:31(8):1682–1707. 10.1105/tpc.18.00952 [DOI] [PMC free article] [PubMed] [Google Scholar]
  168. Sanders  CK, Hanschen  ER, Biondi  TC, Hovde  BT, Kunde  YA, Eng  WL, Kwon  T, Dale  T. Phylogenetic analyses and reclassification of the oleaginous marine species Nannochloris sp. “desiccata” (Trebouxiophyceae, Chlorophyta), formerly Chlorella desiccata, supported by a high-quality genome assembly. J Phycol. 2022:58(3):436–448. 10.1111/jpy.13242 [DOI] [PubMed] [Google Scholar]
  169. Santaguida  S, Amon  A. Short- and long-term effects of chromosome mis-segregation and aneuploidy. Nat Rev Mol Cell Biol. 2015:16(8):473–485. 10.1038/nrm4025 [DOI] [PubMed] [Google Scholar]
  170. Saschenbrecker  S, Bracher  A, Rao  KV, Rao  BV, Hartl  FU, Hayer-Hartl  M. Structure and function of RbcX, an assembly chaperone for hexadecameric Rubisco. Cell. 2007:129(6):1189–1200. 10.1016/j.cell.2007.04.025 [DOI] [PubMed] [Google Scholar]
  171. Sasso  S, Stibor  H, Mittag  M, Grossman  AR. From molecular manipulation of domesticated Chlamydomonas reinhardtii to survival in nature. Elife. 2018:7:e39233. 10.7554/eLife.39233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  172. Schroda  M. Good news for nuclear transgene expression in Chlamydomonas. Cells. 2019:8(12):1534. 10.3390/cells8121534 [DOI] [PMC free article] [PubMed] [Google Scholar]
  173. Schroda  M, Remacle  C. Molecular advancements establishing Chlamydomonas as a host for biotechnological exploitation. Front Plant Sci. 2022:13:911483. 10.3389/fpls.2022.911483 [DOI] [PMC free article] [PubMed] [Google Scholar]
  174. Schroder  MS, Martinez de San Vicente  K, Prandini  TH, Hammel  S, Higgins  DG, Bagagli  E, Wolfe  KH, Butler  G. Multiple origins of the pathogenic yeast Candida orthopsilosis by separate hybridizations between two parental species. PLoS Genet. 2016:12(11):e1006404. 10.1371/journal.pgen.1006404 [DOI] [PMC free article] [PubMed] [Google Scholar]
  175. Schubert  M, Petersson  UA, Haas  BJ, Funk  C, Schroder  WP, Kieselbach  T. Proteome map of the chloroplast lumen of Arabidopsis thaliana. J Biol Chem. 2002:277(10):8354–8365. 10.1074/jbc.M108575200 [DOI] [PubMed] [Google Scholar]
  176. Selmecki  A, Forche  A, Berman  J. Aneuploidy and isochromosome formation in drug-resistant Candida albicans. Science. 2006:313(5785):367–370. 10.1126/science.1128242 [DOI] [PMC free article] [PubMed] [Google Scholar]
  177. Seto  A, Wang  HL, Hesseltine  CW. Culture conditions affect eicosapentaenoic acid content of Chlorella minutissima. J Am Oil Chem Soc. 1984:61(5):892–894. 10.1007/BF02542159 [DOI] [Google Scholar]
  178. Severgnini  M, Lazzari  B, Capra  E, Chessa  S, Luini  M, Bordoni  R, Castiglioni  B, Ricchi  M, Cremonesi  P. Genome sequencing of Prototheca zopfii genotypes 1 and 2 provides evidence of a severe reduction in organellar genomes. Sci Rep. 2018:8(1):14637. 10.1038/s41598-018-32992-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  179. Shihira-Ishikawa  I, Hase  E. Nutritional control of cell pigmentation in Chlorella protothecoides with special reference to the degeneration of chloroplast induced by glucose. Plant Cell Physiol. 1964:5(2):227–240. 10.1093/oxfordjournals.pcp.a079037 [DOI] [Google Scholar]
  180. Shimogawara  K, Fujiwara  S, Grossman  A, Usuda  H. High-efficiency transformation of Chlamydomonas reinhardtii by electroporation. Genetics. 1998:148(4):1821–1828. 10.1093/genetics/148.4.1821 [DOI] [PMC free article] [PubMed] [Google Scholar]
  181. Sipiczki  M. Interspecies hybridization and recombination in Saccharomyces wine yeasts. FEMS Yeast Res. 2008:8(7):996–1007. 10.1111/j.1567-1364.2008.00369.x [DOI] [PubMed] [Google Scholar]
  182. Smukowski Heil  CS, DeSevo  CG, Pai  DA, Tucker  CM, Hoang  ML, Dunham  MJ. Loss of heterozygosity drives adaptation in hybrid yeast. Mol Biol Evol. 2017:34(7):1596–1612. 10.1093/molbev/msx098 [DOI] [PMC free article] [PubMed] [Google Scholar]
  183. Soltis  PS, Soltis  DE. The role of hybridization in plant speciation. Annu Rev Plant Biol. 2009:60(1):561–588. 10.1146/annurev.arplant.043008.092039 [DOI] [PubMed] [Google Scholar]
  184. Soma  A, Onodera  A, Sugahara  J, Kanai  A, Yachie  N, Tomita  M, Kawamura  F, Sekine  Y. Permuted tRNA genes expressed via a circular RNA intermediate in Cyanidioschyzon merolae. Science. 2007:318(5849):450–453. 10.1126/science.1145718 [DOI] [PubMed] [Google Scholar]
  185. Sproles  AE, Fields  FJ, Smalley  TN, Le  CH, Badary  A, Mayfield  SP. Recent advancements in the genetic engineering of microalgae. Algal Res. 2021:53:102158. 10.1016/j.algal.2020.102158 [DOI] [Google Scholar]
  186. Stadler  R, Wolf  K, Hilgarth  C, Tanner  W, Sauer  N. Subcellular localization of the inducible Chlorella HUP1 monosaccharide-H+ symporter and cloning of a co-induced galactose-H+ symporter. Plant Physiol. 1995:107(1):33–41. 10.1104/pp.107.1.33 [DOI] [PMC free article] [PubMed] [Google Scholar]
  187. Stanke  M, Diekhans  M, Baertsch  R, Haussler  D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008:24(5):637–644. 10.1093/bioinformatics/btn013 [DOI] [PubMed] [Google Scholar]
  188. Steenwyk JL, Lind AL, Ries LNA, Dos Reis TF, Silva LP, Almeida F, Bastos RW, Fraga da Silva TFC, Bonato VLD, Pessoni AM, et al. Pathogenic allodiploid hybrids of Aspergillus fungi. Curr Biol. 2020:30(13):2495–2507.e7. 10.1016/j.cub.2020.04.071
  189. Strenkert D, Schmollinger S, Gallaher SD, Salome PA, Purvine SO, Nicora CD, Mettler-Altmann T, Soubeyrand E, Weber APM, Lipton MS, et al. Multiomics resolution of molecular events during a day in the life of Chlamydomonas. Proc Natl Acad Sci U S A. 2019:116(6):2374–2383. 10.1073/pnas.1815238116
  190. Strenkert D, Yildirim A, Yan J, Yoshinaga Y, Pellegrini M, O’Malley RC, Merchant SS, Umen JG. The landscape of Chlamydomonas histone H3 lysine 4 methylation reveals both constant features and dynamic changes during the diurnal cycle. Plant J. 2022:112(2):352–368. 10.1111/tpj.15948
  191. Sui Y, Qi L, Wu J-K, Wen X-P, Tang X-X, Ma Z-J, Wu X-C, Zhang K, Kokoska RJ, Zheng D-Q, et al. Genome-wide mapping of spontaneous genetic alterations in diploid yeast cells. Proc Natl Acad Sci U S A. 2020:117(45):28191–28200. 10.1073/pnas.2018633117
  192. Suzuki S, Endoh R, Manabe R-i, Ohkuma M, Hirakawa Y. Multiple losses of photosynthesis and convergent reductive genome evolution in the colourless green algae Prototheca. Sci Rep. 2018:8(1):940. 10.1038/s41598-017-18378-8
  193. Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021:38(7):3022–3027. 10.1093/molbev/msab120
  194. Tanaka S, Sawaya MR, Kerfeld CA, Yeates TO. Structure of the RuBisCO chaperone RbcX from Synechocystis sp. PCC6803. Acta Crystallogr D Biol Crystallogr. 2007:63(10):1109–1112. 10.1107/S090744490704228X
  195. Tang S, Lomsadze A, Borodovsky M. Identification of protein coding regions in RNA transcripts. Nucleic Acids Res. 2015:43(12):e78. 10.1093/nar/gkv227
  196. Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. Geseq—versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017:45(W1):W6–W11. 10.1093/nar/gkx391
  197. Torres-Machorro AL, Hernandez R, Cevallos AM, López-Villaseñor I. Ribosomal RNA genes in eukaryotic microorganisms: witnesses of phylogeny? FEMS Microbiol Rev. 2010:34(1):59–86. 10.1111/j.1574-6976.2009.00196.x
  198. Tottey S, Block MA, Allen M, Westergren T, Albrieux C, Scheller HV, Merchant S, Jensen PE. Arabidopsis CHL27, located in both envelope and thylakoid membranes, is required for the synthesis of protochlorophyllide. Proc Natl Acad Sci U S A. 2003:100(26):16119–16124. 10.1073/pnas.2136793100
  199. Trincado JL, Entizne JC, Hysenaj G, Singh B, Skalic M, Elliott DJ, Eyras E. SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions. Genome Biol. 2018:19(1):40. 10.1186/s13059-018-1417-1
  200. Tsai IJ, Bensasson D, Burt A, Koufopanou V. Population genomics of the wild yeast Saccharomyces paradoxus: quantifying the life cycle. Proc Natl Acad Sci U S A. 2008:105(12):4957–4962. 10.1073/pnas.0707314105
  201. Ueno R, Hanagata N, Urano N, Suzuki M. Molecular phylogeny and phenotypic variation in the heterotrophic green algal genus Prototheca (Trebouxiophyceae, Chlorophyta). J Phycol. 2005:41(6):1268–1280. 10.1111/j.1529-8817.2005.00142.x
  202. Vande Zande P, Zhou X, Selmecki A. The dynamic fungal genome: polyploidy, aneuploidy and copy number variation in response to stress. Annu Rev Microbiol. 2023:77(1):341–361. 10.1146/annurev-micro-041320-112443
  203. Vetting MW, Hegde SS, Fajardo JE, Fiser A, Roderick SL, Takiff HE, Blanchard JS. Pentapeptide repeat proteins. Biochemistry. 2006:45(1):1–10. 10.1021/bi052130w
  204. Vogler BW, Starkenburg SR, Sudasinghe N, Schambach JY, Rollin JA, Pattathil S, Barry AN. Characterization of plant carbon substrate utilization by Auxenochlorella protothecoides. Algal Res. 2018:34:37–48. 10.1016/j.algal.2018.07.001
  205. Wang B, Jia Y, Dang N, Yu J, Bush SJ, Gao S, He W, Wang S, Guo H, Yang X, et al. Near telomere-to-telomere genome assemblies of two Chlorella species unveil the composition and evolution of centromeres in green algae. BMC Genomics. 2024:25(1):356. 10.1186/s12864-024-10280-8
  206. Werner A, Kanhere A, Wahlestedt C, Mattick JS. Natural antisense transcripts as versatile regulators of gene expression. Nat Rev Genet. 2024:25(10):730–744. 10.1038/s41576-024-00723-z
  207. Wolff G, Burger G, Lang BF, Kück U. Mitochondrial genes in the colourless alga Prototheca wickerhamii resemble plant genes in their exons but fungal genes in their introns. Nucleic Acids Res. 1993:21(3):719–726. 10.1093/nar/21.3.719
  208. Yan D, Dai J, Wu Q. Characterization of an ammonium transporter in the oleaginous alga Chlorella protothecoides. Appl Microbiol Biotechnol. 2013:97(2):919–928. 10.1007/s00253-012-4534-x
  209. Yan D, Wang Y, Murakami T, Shen Y, Gong JH, Jiang HF, Smith DR, Pombert J-F, Dai JBA, Wu QY. Auxenochlorella protothecoides and Prototheca wickerhamii plastid genome sequences give insight into the origins of non-photosynthetic algae. Sci Rep. 2015:5(1):14465. 10.1038/srep14465
  210. Yang B, Liu J, Jiang Y, Chen F. Chlorella species as hosts for genetic engineering and expression of heterologous proteins: progress, challenge and perspective. Biotechnol J. 2016:11(10):1244–1261. 10.1002/biot.201500617
  211. Yona AH, Manor YS, Herbst RH, Romano GH, Mitchell A, Kupiec M, Pilpel Y, Dahan O. Chromosomal duplication is a transient evolutionary solution to stress. Proc Natl Acad Sci U S A. 2012:109(51):21010–21015. 10.1073/pnas.1211150109
  212. Yoon PH, Skopintsev P, Shi H, Chen L, Adler BA, Al-Shimary M, Craig RJ, Loi KJ, DeTurk EC, Li Z, et al. Eukaryotic RNA-guided endonucleases evolved from a unique clade of bacterial enzymes. Nucleic Acids Res. 2023:51(22):12414–12427. 10.1093/nar/gkad1053
  213. Yuen KW, Warren CD, Chen O, Kwok T, Hieter P, Spencer FA. Systematic genome instability screens in yeast and their potential relevance to cancer. Proc Natl Acad Sci U S A. 2007:104(10):3925–3930. 10.1073/pnas.0610642104
  214. Zemach A, McDaniel IE, Silva P, Zilberman D. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science. 2010:328(5980):916–919. 10.1126/science.1186366

Supplementary Materials

koaf259_Supplementary_Data

Data Availability Statement

All sequencing data associated with Auxenochlorella UTEX 250-A are available from NCBI under the BioSample SAMN45466464. The UTEX 250-A genome and gene annotations are available under the BioProjects PRJNA1195245 (Haplotype A) and PRJNA1195244 (Haplotype B). Haploid genome assemblies and sequencing data for UTEX 25, UTEX 2341, CCAP 211/61 and CCAP 211/7A (sequencing data only) are available from NCBI under the BioProject PRJNA1328465. Alignment and tree files for phylogenetic analysis are provided as Supplementary Files S3 to S14.


Articles from The Plant Cell are provided here courtesy of Oxford University Press
