A conditional temperature-sensitive mutant collection in cell-cycle-essential pathways in the green alga Chlamydomonas reinhardtii was constructed using a multiplexed genetic/bioinformatics method.
Abstract
We generated a large collection of temperature-sensitive lethal mutants in the unicellular green alga Chlamydomonas reinhardtii, focusing on mutations specifically affecting cell cycle regulation. We used UV mutagenesis and robotically assisted phenotypic screening to isolate candidates. To overcome the bottleneck at the critical step of molecular identification of the causative mutation (“driver”), we developed MAPS-SEQ (meiosis-assisted purifying selection sequencing), a multiplexed genetic/bioinformatics strategy. MAPS-SEQ allowed us to perform multiplexed simultaneous determination of the driver mutations from hundreds of neutral “passenger” mutations in each member of a large pool of mutants. This method should work broadly, including in multicellular diploid genetic systems, for any scorable trait. Using MAPS-SEQ, we identified essential genes spanning a wide range of molecular functions. Phenotypic clustering based on DNA content analysis and cell morphology indicated that the mutated genes function in the cell cycle at multiple points and by diverse mechanisms. The collection is sufficiently complete to allow specific conditional inactivation of almost all cell-cycle-regulatory pathways. Approximately seventy-five percent of the essential genes identified in this project had clear orthologs in land plant genomes, a huge enrichment compared with the value of ∼20% for the Chlamydomonas genome overall. Findings about these mutants will likely have direct relevance to essential cell biology in land plants.
INTRODUCTION
The core cell cycle machinery is regulated by internal and external cues to appropriately modulate proliferation in various developmental stages and environmental conditions. Therefore, cell cycle regulation plays a pivotal role in plant development and growth (Inzé and De Veylder, 2006; Francis, 2007). Cell cycle control is well studied in Opisthokonts (animals and fungi) and is deeply conserved within that lineage; however, Opisthokonts diverged from the green algae/land plant lineage early in eukaryotic evolution (Rogozin et al., 2009), leaving open the possibility that much of the Opisthokont model might apply poorly or not at all to the plant kingdom. To the extent that the systems in plants are ultimately shown empirically to be similar or even identical to those in Opisthokonts, that is a very good result because it means that much hard-won knowledge can be transferred more or less wholesale from yeast and animals to plants. Already, though, there are clear examples of strong divergence in cell cycle control between the plant kingdom and Opisthokonts (Cools et al., 2011; Nowack et al., 2012; Kumar et al., 2015; Atkins and Cross, 2018). Therefore, simply observing orthologous sequences in a new genome does not by itself guarantee similar function; experimentation is required. Notably, these divergences are due both to the involvement of distantly related or unrelated gene sequences carrying out the same function and to orthologous sequences that have taken on very distinct functions in different lineages (Cross et al., 2011).
Green algae constitute a large and ecologically important class of photosynthetic organisms, of which land plants are an evolutionary offshoot. Chlamydomonas reinhardtii has conserved some important features found also in animals that were lost in the land plant lineage (for example, cilia and basal body/centrioles). However, Chlamydomonas is clearly a member of the broader plant kingdom; this leads to a very important (but perhaps underexploited) role as a microbial model for land plants (Bisova et al., 2005). The extraordinary utility of yeast models for animal cell biology clearly speaks to the value of this approach overall. Due to accelerated evolution in fungal lineages but not in plant and algal lineages, Chlamydomonas is a substantially better genomic match to land plants than yeasts are for animals.
Ancient genome duplications in the land plant lineage resulted in variable but frequently very high paralog levels (recent gene duplicates with very similar sequences). Paralogs frequently have overlapping functions, severely complicating loss-of-function genetic analysis. However, these genome duplications occurred after divergence from green algae, and most genes in Chlamydomonas are single copy (Bisova et al., 2005). Loss-of-function mutations in single-copy genes immediately reveal the maximum phenotypic consequences, as opposed to the interesting but frequently subtle effects resulting from loss of single paralogs out of large families in land plant genetic studies.
A forward-genetic screen is not dependent on prior knowledge/annotations, allowing for unbiased detection of the critical genes. Random insertional disruptions have been strongly preferred for mutagenesis in Chlamydomonas (recently performed on a systematic scale; Zhang et al., 2014; Li et al., 2016) for three main reasons. First, they have a high probability of producing a knockout of all function, simplifying analysis. Second, insertion of foreign DNA provides a molecular tag for identification of the disrupted gene. Third, the number of insertions per clone is low (frequently only one), so the causative lesion is obvious. However, since insertional mutagenesis mainly produces null alleles, that method in principle cannot yield mutations in the essential gene set. Point mutant screens, with random mutations introduced by chemicals or UV, have the key positive feature, not shared by insertional disruptions, that essential genes can be identified by looking for temperature-sensitive (Ts) lethal point mutations. Ts mutants not only serve as tools for identifying essential functions, but they also provide convenient experimental handles for phenotyping to determine function. This is essential for previously completely unknown genes and is very important even for genes with identified sequence orthologs in other organisms, since sequence orthology suggests but does not prove functional equivalence (Cross et al., 2011).
A drawback to point mutagenesis is that identification of the causative lesion and mutated gene in Chlamydomonas was, until recently, nearly intractable. Extensive mapping was required, sequencing was slow and cumbersome, and random mutagenesis resulted in hundreds or thousands of lesions per clone, only one of which was truly causative. In our previous work (Tulin and Cross, 2014), we developed a method that combines bulked segregant sequence analysis with next generation sequencing to identify causative mutations. Using this method, a UV-induced mutant with a mutation causing a Ts− phenotype was put through a cross to the wild type, and a small pool of Ts− meiotic segregants was collected by tetrad analysis. A DNA sample from this pool was sequenced to high coverage. Reads at the position of the causative lesion were uniformly mutant, while reads at the position of irrelevant point mutations were on average 50% mutant. This method is effective but time-consuming, requiring individual crosses and library preparation for each mutant.
In another study (Onishi et al., 2016), we used a combinatorial sequencing strategy to identify all mutations in a large set of mutants by sequencing only a small number of pools with overlapping membership. Causative lesions were identified from the deconvolved lists of total mutations in each strain based on recurrent mutations of genes from independent strains previously determined to be in the same complementation group. This method is effective, but it is restricted to genetic screens identifying only a small number of complementation groups, since it requires at least two independent driver mutations in each complementation group to be present in the pool.
Here, we introduce innovative methodologies for multiplexed, robust identification of causative lesions for nearly 100 mutants in parallel. This methodology represents a major scale-up of our work developing and sequence-identifying a collection of Ts-lethal mutations in Chlamydomonas, with the aim of approaching saturation (complete identification) of cell-cycle-specific pathways. We started with a greatly expanded pool of Ts-lethal mutations (Supplemental Figure 1). We systematized and improved the combinatorial sequencing approach to determine the complete set of mutations in a large collection of mutants in parallel (Erlich et al., 2009; Onishi et al., 2016). Then, to identify driver mutations from these mutants, we developed a highly efficient multiplexed version of bulked segregant sequence analysis. Unlike the method by Onishi et al. (2016), this method does not require multiple independent alleles and thus is suitable for a screen with a large mutational target size. These methods required us to develop innovative upstream genetic procedures to produce maximally informative sequence libraries, as well as sensitive bioinformatics for sequence analysis to make efficient use of even rare minority reads without confounding our results with inevitable sequencing/alignment errors. Importantly, these methodologies can in principle be applied to a range of meiotic systems, including multicellular diploids, with some modifications.
Using these methods, we broadened our recovery of lesions in genes required for cell cycle progression to an estimated ∼50% of the total; since all biological pathways are multicomponent, this implies that we likely have identified (and have multiple experimental handles to inactivate) nearly all essential pathways contributing to cell cycle regulation. We also identified mutations inactivating broad biosynthetic pathways including general transcription, ribosome biogenesis, and translation, indicating the utility of the methods for investigations beyond cell cycle regulation. The mutants represent a valuable resource for investigation into how cell cycle progression influences any cell behavior of interest. Over 75% of the genes identified have conserved sequence orthologs in land plant genomes; thus, their characterization is highly likely to be informative for essential cell biology across the plant kingdom.
RESULTS
We began the study by isolating thousands of mutants harboring Ts-lethal mutations in essential genes using a highly efficient robotically assisted pipeline (Tulin and Cross, 2014; Breker et al., 2016). To enrich for mutagenized genes with potential cell cycle involvement, we used time-lapse microscopy to focus on mutants that show few detectable problems with cell growth but fail to complete a division cycle (Figure 1; Supplemental Figure 1) (assays and logic previously described in detail in Tulin and Cross, 2014; Breker et al., 2016). This quality control step reduces the number of mutants to be analyzed by a factor of ∼10 to 100, but still leaves many more mutants than could practically be sequence identified by the rather laborious bulked segregant sequence analysis strategy of Tulin and Cross (2014).
Figure 1.
Distinct Arrest Morphologies Differentiate Mutants.
Nitrogen-depleted synchronized samples were plated on TAP agar plates and were transferred to restrictive temperature (33°C). Images were taken every hour by light microscopy.
(A) Wild-type sample. Cells grow over the course of the first 10 h and then enter rapid cycles of division, forming clusters of cells (as seen at the 12-h time point), followed by hatching of newborn cells.
(B) Some mutants fail in cell growth (Non-growers). Other mutants (Large round) grow but fail to make cleavage planes (CP). Mutants making cleavage planes (arrows) frequently lose cell integrity a few hours later (Tulin and Cross, 2014); some mutants (Arrested clusters) stay morphologically intact. White arrows mark the initiation of CPs. *Because the images of ess123-1 were taken at different focal planes, cell size may erroneously appear to shrink.
A Combinatorial Strategy to Sequence Mutants in Overlapping Pools
These UV-mutagenized strains harbor hundreds of mutations scattered randomly across the genome. To determine the complete list of mutations in each individual strain, we employed a combinatorial pooling scheme in which each individual mutant occupies a unique pattern within a set of bar-coded pools (Figure 2) (Erlich et al., 2009; Onishi et al., 2016).
Figure 2.
Combinatorial Framework to Efficiently and Accurately Sequence Pools of Mutants.
(A) A set of mutants is distributed in overlapping fashion across many pools (in the illustration, 28 mutants in eight pools, A–H).
(B) Each mutant has a unique occupancy pattern. Mutant numbers occupying these pools (C and D) are represented in black and gray. Minority reads shared exclusively between two pools allow assignment of a mutation uniquely to the relevant mutant (marked in black).
(C) Representative numbers of mutations identified in a set of eight pools. “All assigned” (single-nucleotide polymorphisms) is the total number of genetic lesions that were identified in the set. “CDS-involved” is the number of genetic lesions that change the annotated coding sequence. The complementary number is the “Non-CDS.” The number in parentheses is the average per mutant.
(D) Example for a detected single-nucleotide polymorphism. Minority reads “C” were identified uniquely in C and D pools (compared with the majority reads “T”) and were assigned uniquely to mutant 14. Together with the adjacent mutation, the lesion (C>T) results in a conversion from leucine to serine in amino acid 151 of CDC20.
While straightforward in principle, this approach has challenges compared with simple sequence determination of a single clone, since it requires analysis of minority reads at high coverage (“minority reads” are reads with an alternative call potentially derived from an authentic mutation, at a position where a majority [in this application, >95%] of the reads correspond to the wild-type reference sequence). The error rate in Illumina sequencing is significant, so at high enough coverage, almost any position will have at least some erroneous calls; these must be distinguished from the minority reads representing a genuine mutation derived from some strain. Sequence coverage of each pool, the mutant occupancy within a set of pools, and number of mutants per pool were all optimized to ensure that minority reads are reliably detected and clearly distinguishable from Illumina/aligner errors (Figure 2A; see Methods). Additionally, we developed quality control measures that eliminated the vast majority of erroneous calls from consideration, based on Illumina quality scores for the position and for the read overall (independent of the position) and number of mismatches (excluding the position of interest) in the read overall. These measures are only partially correlated; combined, they reduce the error rate to ∼3 × 10⁻⁴ per base call, while eliminating only a small proportion of the input data.
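The three quality control criteria can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the threshold values and the names `BaseCall` and `passes_filters` are hypothetical, chosen only to show how partially correlated filters combine so that a call is kept only when all of them pass.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the text does not state the cutoffs used.
MIN_BASE_QUALITY = 30
MIN_MEAN_READ_QUALITY = 30.0
MAX_OTHER_MISMATCHES = 2


@dataclass
class BaseCall:
    base_quality: int          # Phred score at the position of interest
    mean_read_quality: float   # mean Phred score over the rest of the read
    other_mismatches: int      # mismatches to the reference, excluding this position


def passes_filters(call: BaseCall) -> bool:
    """Keep a call only if all three criteria pass."""
    return (
        call.base_quality >= MIN_BASE_QUALITY
        and call.mean_read_quality >= MIN_MEAN_READ_QUALITY
        and call.other_mismatches <= MAX_OTHER_MISMATCHES
    )


calls = [
    BaseCall(38, 36.5, 0),   # clean read: kept
    BaseCall(12, 35.0, 0),   # low quality at the position: dropped
    BaseCall(37, 22.0, 1),   # low-quality read overall: dropped
    BaseCall(39, 37.0, 6),   # many mismatches elsewhere in the read: dropped
]
kept = [c for c in calls if passes_filters(c)]
```

Because each criterion catches a somewhat different class of error, requiring all three removes more erroneous calls than any single cutoff while discarding relatively few good reads.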
We aimed for sufficient coverage to allow each mutation in each mutant to give a reliable signal and experimented with varying numbers of mutants and pools (Figure 2A). In the largest experiment, we analyzed 84 mutants in nine pools (each mutant represented in three pools). Statistical analysis (Supplemental Figure 2) indicated >95% accuracy of the assignment of mutations to a single specific mutant (details in Methods).
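The pool counts are consistent with a design in which every mutant occupies a distinct subset of pools: nine pools taken three at a time give exactly 84 patterns, and eight pools taken two at a time give the 28 patterns of the Figure 2 illustration. The Python sketch below assumes that design (the actual layout is described in Methods) and uses hypothetical function names; it shows how a mutation is decoded from the set of pools in which its minority reads appear.

```python
from itertools import combinations


def assign_pools(n_pools: int, pools_per_mutant: int) -> dict:
    """Map each distinct subset of pools to a mutant index (0-based)."""
    subsets = combinations(range(n_pools), pools_per_mutant)
    return {frozenset(s): mutant for mutant, s in enumerate(subsets)}


def decode(pattern_to_mutant: dict, pools_with_minority_reads) -> object:
    """Return the mutant whose occupancy pattern matches exactly, or None."""
    return pattern_to_mutant.get(frozenset(pools_with_minority_reads))


design_84 = assign_pools(9, 3)   # nine pools, each mutant in three pools
design_28 = assign_pools(8, 2)   # illustration: eight pools, each mutant in two pools

n_84 = len(design_84)            # number of distinct occupancy patterns
n_28 = len(design_28)
owner = decode(design_84, {0, 4, 7})    # minority reads seen in exactly three pools
unassigned = decode(design_84, {0, 4})  # incomplete pattern: no mutant matches
```

A mutation whose minority reads appear in a pattern that matches no mutant (for example, because one pool had too few reads at that position) is left unassigned in this sketch rather than guessed.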
For direct confirmation, we tested numerous assigned mutations using an allele-specific PCR method (Onishi et al., 2016). In every case tested, this assay showed that the mutation of interest was present uniquely in the assigned strain.
Although the mutants contain only one driver mutation each, the procedure identifies thousands of “passenger” mutations, which can be useful. For example, we found candidate null alleles in two cyclin D genes as passengers. These cyclin D mutations were not lethal (other mutations in the strains were shown to be causative) but may be useful in synthetic lethal genetics and for other purposes. We include Supplemental Data Set 1, with a list of 706 candidate null alleles in passenger mutations, with the thought that they may be useful tools for other studies.
MAPS-SEQ: Meiosis-Assisted Purifying Selection Sequencing
The overlapping pool sequencing yielded a reliable list of mutations present in each mutant. On average, ∼70 of the ∼200 mutations per clone alter coding sequences (missense, nonsense, or splice donor/acceptor site mutations, insertions, and deletions) at locations across the genome. However, tetrad analysis showed 2:2 segregation of Ts−:Ts+ progeny in crosses of ∼90% of the mutants to the wild type, indicating that Ts lethality is due to a single genetic lesion. Therefore, a method was needed to identify the “driver” (single mutation responsible for Ts lethality). Previously, we accomplished this by bulked segregant sequencing (Tulin and Cross, 2014), but this approach was unwieldy at the scale of this project. We therefore developed MAPS-SEQ, a technique that multiplexes driver discovery, providing a massive increase in efficiency.
A pool of up to 84 mutants, for which all mutations are known from the previous step, is mated (using equal numbers of mutants of each of the two Chlamydomonas mating types) (Figure 3). In this mixture, each mutant can mate at random with each mutant of the other mating type. The mated cells form zygospores committed to undergoing meiosis. Purified zygospores complete meiosis, and haploid products enter mitotic growth, producing a single genetically complex pool of meiotic recombinants (Figure 3A).
Figure 3.
Mass Mating of Ts− Mutants Followed by Selection of Ts+ Meiotic Segregants: Behavior of Causative Mutations and of Linked and Unlinked Passengers.
(A) Sets of 42 Mat+ and 42 Mat– Ts− mutants were crossed in a mass mating (see Methods) and allowed to form zygospores (premeiotic diploids; ∼500 zygospores per mutant). Transfer to light triggers meiosis and the formation of four haploid progeny per zygospore. According to Mendelian inheritance for unlinked lesions, the expected result is that 50% of these haploids will inherit a single Ts-lethal mutation from one parent (mutant A or mutant B), 25% will inherit Ts-lethal mutations from both parents, and 25% will inherit no Ts-lethal mutations and will therefore be Ts+.
(B) Passenger (neutral) mutations should appear in the Ts+ pool at a frequency of the reciprocal of the total number of mutants in the pool (since any neutral chromosome in the pool is equally likely). However, the rare causative lesions should be entirely depleted, and passenger mutations linked on the same chromosome with the causative lesion will be detected at progressively decreasing frequency as the mutation gets closer to the causative mutation. Rarely, very tightly linked passengers exhibit complete depletion and are not distinguishable from linked causative mutations. In the diagram, the red star represents a causative mutation from some mutant; green or orange indicates a chromosome lacking or containing a causative mutation. Mutations from the causative chromosome that are distant from the causative mutation can readily recombine onto a neutral chromosome. The closer the neutral mutation is to the causative mutation, the lower its frequency in the Ts+ pool (schematics at right).
(C) Tetrad analysis of random zygospores of the pools confirms that most parental diploids contain two unlinked Ts− mutations, as expected. PD, parental ditype: 0 Ts+:4 Ts−; TT, tetratype: 1 Ts+:3 Ts−; NPD, nonparental ditype: 2 Ts+:2 Ts−. The net yield of Ts+ is 23% (expected 25%).
(D) Plotting frequencies of all predefined lesions on two chromosomes (12 and 16) within the Ts+ pool (far left) and after computational extraction of the mutations on these chromosomes specific to mutants A and B. For Chr. 12-Mutant B or Chr. 16-Mutant A, all mutations are passengers; Chr. 12-Mutant A and Chr. 16-Mutant B exhibit the pattern indicative of a causative genetic lesion: decreasing frequency along the chromosome, up to complete depletion (a V shape) (compare with the expectation sketched in Figure 3B). The expected slope of this V shape is dependent on the mutant proportion within the pool and the conversion between physical and genetic distance. In Chlamydomonas, the conversion is ∼10 cM per 1 Mb (Merchant et al., 2007; Tulin and Cross, 2014), resulting in an expected slope of ∼0.6 for an average coverage of 3%, approximately as observed.
At each chromosomal position, this final pool should contain an approximately equal mixture of every possible version from the original pool (Figure 3B). Mendelian inheritance means that 50% of haploid progeny will be Ts− lethal due to a single lesion, 25% Ts− due to two independent genetic lesions, and 25% Ts+ (Figure 3A); this was indeed observed by tetrad analysis of random zygospores from the pool (Figure 3C). After brief propagation of meiotic products at permissive temperature, we impose several rounds of high-temperature selection. All mutations from the original pool will survive in the selected pool, except for the causative mutations and passenger mutations that are closely linked to them (Supplemental Figure 1B). Based on the number of zygospores produced by the mating mix (estimated microscopically) and the number of mutants that contributed equally to the mating pool, each mutant probably participated in several hundred independent mating events to various other mutants from the pool. Genetic recombination over the resulting hundreds of meioses will allow the separation of even tightly linked passengers. With 25× coverage per mutant in the pool, the limit of resolution is ∼4 cM between passengers and causative mutation (100 × 1/25), converting to a physical distance of ∼0.4 Mb (Merchant et al., 2007; Tulin and Cross, 2014). Higher sequence coverage should allow resolution down to sub-cM levels, given the large number of potentially recombinant chromosomes in the pool.
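The expected V shape and the resolution estimate can be sketched numerically. The Python sketch below is illustrative rather than the analysis code used in the study. It assumes the Haldane map function (no crossover interference), takes the ∼10 cM per Mb conversion and a ∼3% baseline passenger frequency from the text, and reads the quoted slope of ∼0.6 as percent per Mb; all function names are hypothetical.

```python
import math

CM_PER_MB = 10.0  # physical-to-genetic conversion quoted for Chlamydomonas


def recombination_fraction(distance_mb: float) -> float:
    """Haldane map function (an assumption; it ignores crossover interference)."""
    morgans = distance_mb * CM_PER_MB / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def expected_passenger_percent(distance_mb: float, baseline_percent: float = 3.0) -> float:
    """Expected frequency (%) in the Ts+ pool of a passenger linked to its driver.

    A Ts+ segregant lacks the driver, so it carries a linked passenger only if a
    crossover separated the two: probability r, versus 0.5 for an unlinked passenger.
    """
    return baseline_percent * recombination_fraction(distance_mb) / 0.5


def resolution_cm(coverage_per_mutant: float) -> float:
    """Resolution limit in cM as computed in the text: 100 * (1 / coverage)."""
    return 100.0 / coverage_per_mutant


slope_near_driver = expected_passenger_percent(0.1) / 0.1   # percent per Mb near the driver
limit_cm = resolution_cm(25)                                 # 25x coverage per mutant
limit_mb = limit_cm / CM_PER_MB
```

Under these assumptions the frequency rises from zero at the driver at close to 0.6% per Mb and saturates at the baseline for distant or unlinked passengers, which produces the two arms of the V.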
We sequence a single DNA preparation from this complex pool to ∼900× overall coverage. Each individual passenger mutation should be present at ∼3% (∼25× coverage). A key advantage of this step is that we can search the aligned sequences only for the mutations predefined as present in each original mutant. This restriction of hypotheses means that even a single mutant read can be informative, unlike the case with de novo sequencing, where multiple concordant reads are typically essential to overcome Illumina error rates over the whole genome. For example, an error rate of 1/3000 (the effective error rate after filtering low-quality positions/reads as described above) over the Chlamydomonas genome at 1000× will result in 4 × 10⁷ single-read “mutations” purely from sequencing errors. This same error rate applied to just 5700 predetermined mutations will result in no false-positive replicates of the mutation in ∼5000 cases and a single false-positive read in ∼700 cases.
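The arithmetic behind this comparison can be reproduced approximately with a short Python sketch. Two inputs are assumptions not stated in the text: a genome size of about 1.2 × 10⁸ bp, and the idea that a sequencing error yields any one of the three alternative bases with equal probability, so that only one third of errors at a known site mimic the specific mutation. False reads per site are modeled as Poisson.

```python
import math

ERROR_RATE = 1 / 3000   # effective per-base error rate after filtering (from the text)
COVERAGE = 1000         # sequencing depth used in the text's example
N_SITES = 5700          # predetermined mutation positions (from the text)
GENOME_BP = 1.2e8       # ASSUMPTION: approximate Chlamydomonas genome size

# De novo search: every erroneous call anywhere in the genome is a candidate "mutation".
genome_wide_errors = GENOME_BP * COVERAGE * ERROR_RATE

# Targeted search: an error matters only if it falls on a known site AND
# produces the specific mutant base (ASSUMPTION: 1 of 3 equally likely alternatives).
mean_false_reads = COVERAGE * ERROR_RATE / 3


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson distribution with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


sites_with_no_false_read = N_SITES * poisson_pmf(0, mean_false_reads)
sites_with_one_false_read = N_SITES * poisson_pmf(1, mean_false_reads)

print(f"genome-wide erroneous calls: {genome_wide_errors:.1e}")
print(f"known sites with 0 false reads: {sites_with_no_false_read:.0f}")
print(f"known sites with 1 false read: {sites_with_one_false_read:.0f}")
```

With these inputs the genome-wide figure matches the 4 × 10⁷ quoted above, and the per-site counts are of the same order as the ∼5000 and ∼700 cases in the text; the exact values depend on the assumed error model.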
Errors in this procedure can come from misassignment of a mutation to the wrong mutant or from stochastic lack of reads covering a passenger mutation. We estimate these to be rare events (<1%) (Supplemental Figure 3).
Identifying Driver Mutations
After computational extraction of the mutations assigned to an individual mutant (Figure 3D), we examine the pattern of mutant read representation across all chromosomes for each mutant. In typical cases, all chromosomes but one show mutant reads at a similar low frequency across the chromosome (the expected representation for passenger mutations), and a single chromosome exhibits a V pattern of progressive depletion, centered on a region with zero mutant reads (Figures 3D and 4A).
Figure 4.
Sequencing of the Ts+ Meiotic Progeny Pool Allows Multiplexed Identification of Causative Genetic Lesions.
(A) In most (77%) mutants, a V-shaped depletion pattern is observed on exactly one of the 17 chromosomes. Mutants were classified as “single-depleted region” mutants as shown. Once the depleted region is identified, close examination of the mutations within this region gives two possible results. In most cases, one or more CDS-changing mutations (marked in blue) were identified and scored as candidate causative mutations.
(B) Mutant Seq4-AH chromosome 12. Arrow indicates candidate causative CDS-changing mutation. This is the only CDS-changing mutation that is completely depleted.
(C) Rarely, no CDS-changing mutations (marked in red circles) were identified in the depleted region (Seq1-CE chromosome 7). Such mutants constitute the “unsequenceable” class.
This is the pattern expected for absolute selection against the causative (Ts-lethal) mutation, along with decreasing coselection against linked passenger mutations (negative hitchhiking). A single-depleted-region pattern (Figure 4) was detected in ∼77% of mutants tested (Figure 5A). In most cases, one or a few candidate coding sequence (CDS)-changing mutations are identified at the base of the V shape (Figure 4B). Multiple depleted candidates most likely correspond to a causative mutation with several tightly linked passengers; this is also a limitation of the bulked segregant sequencing strategy employed previously (Tulin and Cross, 2014).
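The classification logic described here can be sketched in Python. This is a deliberate simplification with hypothetical names and made-up example data: a site counts as depleted when it has zero mutant reads in the Ts+ pool, the flanking V-shaped decline is not modeled, and poorly mating mutants (low representation on all chromosomes) are not handled.

```python
from collections import defaultdict
from typing import NamedTuple


class Site(NamedTuple):
    chrom: int
    pos: int             # position in bp
    mutant_reads: int    # reads supporting the mutation in the Ts+ pool
    cds_changing: bool


def candidate_drivers(sites: list) -> dict:
    """Group fully depleted CDS-changing mutations by chromosome."""
    depleted = defaultdict(list)
    for s in sites:
        if s.mutant_reads == 0 and s.cds_changing:
            depleted[s.chrom].append(s)
    return dict(depleted)


def classify(sites: list) -> str:
    """Assign one mutant to an outcome class from its per-site read counts."""
    chroms_with_depletion = {s.chrom for s in sites if s.mutant_reads == 0}
    if not chroms_with_depletion:
        return "no depletion"
    if len(chroms_with_depletion) > 1:
        return "multiple depleted regions"
    return "single depleted region" if candidate_drivers(sites) else "unsequenceable"


# Made-up data for one mutant: a V on chromosome 12, flat passengers on chromosome 16.
example = [
    Site(12, 1_000_000, 20, False), Site(12, 3_000_000, 9, True),
    Site(12, 4_500_000, 2, False),  Site(12, 5_000_000, 0, True),
    Site(12, 5_600_000, 3, False),  Site(12, 7_500_000, 11, True),
    Site(16, 800_000, 24, True),    Site(16, 2_900_000, 27, False),
    Site(16, 6_100_000, 22, False),
]
outcome = classify(example)
candidates = candidate_drivers(example)
```

In this example the single fully depleted CDS-changing site on chromosome 12 is the only candidate; several depleted candidates in one region would correspond to a driver with tightly linked passengers.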
Figure 5.
Representative Yield from Applying MAPS-SEQ to a Large Collection of Ts-Lethal Mutants.
(A) Seventy-seven percent of 340 mutants tested with this procedure were identified with a single depleted (Dep.) region, and for 82% of them, MAPS-SEQ enabled the identification of the exact genetic lesion. Thirteen percent of these were “known genes” (KG), essential genes identified in our previous work (Tulin and Cross, 2014); 50% are “new genes” (NG), candidate essential genes identified in this study; 14% of mutants are “unsequenceable” (US), with a single depleted region, but with no identified CDS-changing mutation. The rest of the mutants (23%) represent poor maters, mutants without depletion, or mutants with multiple depleted regions (PM, ND, and MD, respectively). These are discussed in the text.
(B) Within the new genes group, 90% have a single candidate causative lesion, and for 10%, we have detected more than one candidate lesion (usually two), likely representing a tightly linked passenger(s) (see Figure 3).
For ∼14% of mutants (∼18% of those with a single depleted region; Figure 5A), we detected no CDS-changing mutation within the depleted region, despite an unambiguous V shape implying the presence of a severely deleterious mutation at its base (Figure 4C). This indicates failure to detect the causative mutation by Illumina sequencing. Importantly, the frequency of this “unsequenceable” phenomenon is not notably different with the present multiplexed approach than with the mutant-by-mutant bulked segregant sequencing approach (Tulin and Cross, 2014) (see Methods).
There are two other possible outcomes. In rare cases, all mutations from the mutant are represented; there is no depleted region across the genome. In a few such cases we examined in detail, the source of this problem was shown to be a synthetic phenotype: Two unlinked mutations in the original mutant were required for Ts lethality. In other cases, few or no mutations from a given mutant are present in the pool. This is most likely due to a poorly mating mutant that therefore contributes very few of its mutations to the zygotic pool, neither causative nor passengers. This overall low representation can cause spurious regions of apparent depletion simply due to statistical fluctuations, but this is clearly recognizable, since mutations on all chromosomes from such mutants are severely underrepresented. This problem will interfere with the identification of causative mutations that reduce the efficiency of mating or meiosis. We manually curated the lists of assigned mutations to identify candidate lesions of interest based on annotations, but this procedure was generally not fruitful due to the high number of candidates.
We applied MAPS-SEQ to ∼350 mutants containing mutations blocking cell division and identified their causative lesions (Figure 5). Table 1 shows functional annotation analysis of the new genes detected in this study. These genes span a broad range of cell biological pathways (for a complete list of mutants and identified genes, see Supplemental Data Set 2).
Table 1. MAPS-SEQ Has Broad Applicability for the Discovery of Essential Genes Functioning in Diverse Biological Pathways.
Functional Annotation | No. of Genes | Examples of Likely Orthologs Defined by BLAST |
---|---|---|
Cell cycle machinery | 41 | APC1, APC2, APC3, APC4, APC6, APC8, CDC20, AUR2, SMC1, SMC2, SMC3, SMC4, POLA, POLD, POLE |
General transcription | 9 | POLII, SPT16, TFIIH, SPT6 |
Splicing | 4 | SRPK, PRP4, PRP8 |
Translation | 3 | GTS2, DOH1, TSK2 |
Protein modification | 3 | UBC12, SENP7, CLPP4 |
Stress/protein folding | 9 | IRE1, HSF, CCT1, SNF1 |
Glycosylation/secretion | 11 | UDP-galactose and -rhamnose epimerase, UDP-sugar transport, PGI, NSF |
Transporters | 3 | Magnesium transporter, sulfate transporter |
Other/unknown | 19 | GIF3, DUO3, clathrin, NUP205, Exportin-2, MIDASIN, BSL1 |
Classes of functional annotations of the genes newly discovered as sequence-identified Ts-lethal alleles in this work, the number of identified genes in each class and examples of their land plant orthologs are shown.
Evaluation of Causality Identification
Identification of the characteristic V pattern provides prima facie evidence in favor of the causative mutation mapping to the base of the V, leading to the hypothesis that (one of the) CDS-changing mutations found at the base is causative. We have tested this in multiple ways.
We employed genetic tests for 27 mutants. First, for some mutants, we tested cosegregation of the candidate mutation with Ts lethality in a cross to the wild type using allele-specific PCR (Gaudet et al., 2009). In all cases tested, we observed the expected cosegregation, implying close linkage of the causative mutation to the candidate mutation. Second, some candidate causative lesions are in genes previously identified as bearing independent causative mutations. In such cases, we test the new mutant for failure to complement the previous mutation in heterozygous diploids and for failure to recombine to produce Ts+ progeny in meiosis (Breker et al., 2016). Third, we can select for reversion of temperature sensitivity and determine whether the revertants alter the coding sequence at or near the site of the candidate mutation, implying that this mutation was causative. We isolated revertants for predicted mutants in CYCB1 (CYCLIN B) and in CDC20 (CELL DIVISION CYCLE20) (Figure 2D) and found intragenic reversion (Supplemental Figure 4), confirming causality. Information about gene validation is included in Supplemental Data Set 2.
Finally, to determine the reliability of the whole population of assigned causative mutations, we examined statistical features of the candidates compared with the pool of passenger mutations. We showed previously (Tulin and Cross, 2014; Cross et al., 2017) that causative mutations leading to Ts lethality were systematically more likely than passengers to fall in regions of protein sequences conserved in the land plant Arabidopsis thaliana; additionally, causative mutations are on average more “severe” (using the Blosum62 scoring matrix; Henikoff et al., 2002) than passengers.
The large majority of mutations are sure to be passengers, which were therefore not depleted in the Ts+ pool. Consistently, undepleted mutations were preferentially found in higher numbered BLAST/Blosum categories due to less severe mutation and/or low sequence conservation (Figure 6).
Figure 6.
Conservation and Evolutionary Constraints Confirm the Likelihood of Candidate Causative Lesions.
As described (Tulin and Cross, 2014), we classify mutations according to sequence conservation and severity. Category A: mutation falls within a segment of BLAST alignment to Arabidopsis (HSP) and alters a conserved residue (Blosum62 > 0 for Chlamydomonas versus Arabidopsis) within this segment. Category B: mutation falls within an overall conserved region but alters an unconserved residue (Blosum62 ≤ 0). Category C: mutation lies N-terminal or C-terminal to all detected HSPs. Category D: mutation in gene with no Arabidopsis BLAST hit. Bar graphs demonstrate the distribution of the identified mutations in this work according to the classified groups. Within each category, mutations are classified according to severity: severe mutation (Blosum62 < −1); less severe mutation (Blosum62 ≥ −1).
(A) Distribution of all mutations (including both the depleted ones from the Ts+ pool and presumably passenger mutations that appear in the pool).
(B) Distribution of mutations depleted in the Ts+ pool (potential drivers) within genes that were previously verified to be essential (Tulin and Cross, 2014) (known genes).
(C) Distribution of mutations depleted in the Ts+ pool that are in genes newly identified in this work (new genes).
(D) Distribution of depleted mutations for mutants that were assigned with more than one candidate (in nearly every case these mutations are closely linked). The distribution in (D) suggests bimodality (a combination of the patterns in [A] and [B]), as would be expected for a mixture of driver and linked passenger mutations in this class (see text).
(E) The essential genes set (middle) is strongly enriched (78%) for orthologous genes in Arabidopsis compared with the total Chlamydomonas proteome (left) (P < 0.00001). In contrast, the nonessential gene set (right; defined as genes with passenger mutations that are candidate null alleles; Supplemental Data Set 1) is depleted for orthologous gene content compared with the total proteome (left) (P < 0.00001).
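As an illustration only, the category and severity rules stated in the Figure 6 legend can be written as a short Python sketch. The class and field names are hypothetical, and the sketch assumes that severity is scored as the Blosum62 value for the wild-type versus mutant residue; the scores themselves are taken as precomputed inputs rather than looked up in a substitution matrix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mutation:
    # All field names are hypothetical; they are not taken from the original pipeline.
    has_blast_hit: bool                 # gene has any Arabidopsis BLAST hit
    in_hsp: bool                        # mutated position lies inside an aligned segment (HSP)
    conservation_score: Optional[int]   # Blosum62 for Chlamydomonas vs Arabidopsis residue (None outside HSPs)
    change_score: int                   # Blosum62 for wild-type vs mutant residue (assumed severity measure)


def conservation_category(m: Mutation) -> str:
    """Return category A, B, C, or D following the rules in the figure legend."""
    if not m.has_blast_hit:
        return "D"  # no Arabidopsis BLAST hit for the gene
    if not m.in_hsp:
        return "C"  # N-terminal or C-terminal to all detected HSPs
    if m.conservation_score is not None and m.conservation_score > 0:
        return "A"  # conserved residue inside an HSP
    return "B"      # inside a conserved region, unconserved residue


def severity(m: Mutation) -> str:
    """Severe if Blosum62 < -1, less severe otherwise."""
    return "severe" if m.change_score < -1 else "less severe"


example = Mutation(has_blast_hit=True, in_hsp=True, conservation_score=4, change_score=-3)
print(conservation_category(example), severity(example))
```

Tabulating these two labels over a set of mutations would give counts of the kind plotted in the bar graphs.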
Among the depleted mutations (candidate drivers), some fell in genes that we identified previously as yielding Ts-lethal mutations (Tulin and Cross, 2014). We take these mutations as highly likely to be causative (as noted above, where tested, this was confirmed by complementation testing against the previous isolates). These mutations are highly enriched in the BLAST/Blosum categories that indicate more severe mutation and/or high sequence conservation, consistent with previous findings (Tulin and Cross, 2014; Cross et al., 2017) (Figure 6). This discrimination thus provides a statistical test for depleted mutations in genes not identified in our previous study. These showed a BLAST/Blosum distribution very similar to that of the mutations in the previously verified genes and very different from that of the bulk population of presumed passenger mutations. This suggests that true causative mutations in previously unidentified genes are highly enriched in the depleted set.
Finally, in some cases, more than one closely linked CDS-changing mutation was depleted. Most likely, one of these is causative and the others are closely linked passengers. The depleted mutations from these mutants gave an apparently bimodal distribution among the BLAST/Blosum categories that suggested the presence of both causative and passenger mutations. For almost all such mutants, just one of these linked candidates was in one of the lower numbered (higher causation probability) categories; it is very likely that this one is causative. This idea was tested directly by genetic means (reversion or complementation testing; see above) for two such mutants and found to be correct.
From this statistical analysis, we conclude that the depleted mutations are very strongly enriched in the causative subset, and in occasional cases where more than one candidate is detected by depletion, analysis of BLAST/Blosum category will identify the causative one with high probability, consistent with previous results (Tulin and Cross, 2014).
Phenotypes of Gene Inactivation
Because the primary biological focus of the screen was cell cycle regulation, we used DNA flow cytometry of synchronized cells as a primary classifier of newly isolated mutants. Since the mutations are temperature conditional, phenotypic analysis of temperature-shifted cultures provides an immediate view of the functional consequences of inactivation of these genes. Chlamydomonas has a long G1, during which cells can grow more than 10-fold, followed by multiple rapid cycles of DNA replication and mitosis. These division cycles occur within the mother cell wall; after approximately four to five divisions, the 16 to 32 daughters stop dividing, develop flagella, and hatch from the mother cell wall to begin another growth cycle. Therefore, DNA flow cytometry on wild-type cells in the middle of a partially synchronous division phase results in a mix of large (undivided) cells with 1C, 2C, 4C, 8C, and 16C DNA content and small newborn hatched daughter cells with 1C DNA content (in the latter, cells have left-shifted apparent DNA content due to small cell size). We chose a single time point (14 h at restrictive temperature) since in a preliminary wild-type time course, the highest proportion of dividing cells was observed at that time. Two-parameter flow cytometry of DNA content by Sytox staining and measurement of forward scatter (reflecting cell size) gives characteristic patterns that with appropriate gating can yield a quantitative assessment of all these populations (Figure 7). These gatings, when applied to mutants that have just reached their arrest points, then provide a quantitative readout of the arrest. For example, the cdc20 (APC-activating subunit) mutant yields a predominant accumulation of large cells with 2C DNA content (Figures 7 and 8).
Figure 7.
Functional Grouping of Mutants Based on Hierarchical Clustering of DNA Content Profiles.
Dot plots of DNA content (intensity of SYTOX green staining) and cell size measured by forward scatter (FSC). Gates are applied to extract the number of cells in each DNA peak. Gate colors: gray, newborns; cyan, 1C large (pre-S phase); red, 2C; orange, 4C; purple, 8C; brown, 16C; black (dotted circle), cell aggregates.
(A) In a wild-type sample there are multiple peaks of DNA content, reflecting sequential rounds of replication.
(B) The cdc20 mutant arrests with mostly 2C.
(C) Numbers of cells in each gate are extracted; accordingly, each sample has a characteristic vector of numbers.
(D) Hierarchical clustering of all samples based on the DNA content profiles. Letters on the right indicate the definitive groups that emerge from the clustering. Letters below the clustergram indicate the discrete peaks of DNA content. NB, newborn; Agg, aggregates; Db, debris; Int, intermediates. The color key on the left provides the standard deviations from the mean (=0) for the normalized values in the clustergram.
Figure 8.
Mutants in APC Subunits and Some DNA Replication Factors Arrest after the First Round of Replication.
DNA-stained samples were analyzed by flow cytometry, and gates were applied as described in Figure 7. Hierarchical clustering of all samples tested reveals distinct functional groups, among them arrested samples at 2C that are mostly composed of APC subunits (A) and DNA replication factors (B). NB, newborn; Agg, aggregates; Db, debris; Int, intermediates. The color key on the left provides the standard deviations from the mean (=0) for the normalized values in the clustergram.
We used this vector for hierarchical clustering of 187 independent Ts-lethal mutations (Figure 7C) and observed strong clustering based on a few simple patterns (discussed below; complete data for all mutants tested are available in Supplemental Data Set 3). Similar results were obtained with a different flow cytometry method (Supplemental Figure 5E).
In this data set, approximately half of the mutants were represented by two siblings from backcrosses to the wild type. These “pairs” allow an estimate of reproducibility for a given mutation to be obtained. As shown in Supplemental Figure 6, pairs of siblings are much closer to each other in Euclidean distance than are random pairs, supporting the reliability of the assay. This finding implies that despite the ∼70 coding sequence changes in an average mutant clone, in most cases, only one change has significant phenotypic consequences; the remainder are near neutral. In some cases, we do detect phenotypic alterations due to moderately deleterious background mutations. The most typical such result is a moderate slowing in cell cycle progression; the final phenotype is almost never affected. This slowing does result in some cases in mutants being clustered with lower replication levels than achieved in their final arrest or in the arrest of backcrossed strains; some examples are noted in Supplemental Data Set 1.
Different mutant alleles in the same genes were also closer in Euclidean distance than random pairs, indicating reproducible phenotypes of independent mutations in most target genes; however, these distances were greater than between siblings, suggesting (as is generally the case) that different alleles may inactivate gene function to different degrees (Supplemental Figure 6).
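The sibling-versus-random comparison can be illustrated with a minimal Python sketch. It assumes that each sample is a vector of per-gate cell counts converted to fractions before distances are computed; the gate order, sample names, and counts are invented toy values, and the normalization used for the published clustergram may differ.

```python
import math
from itertools import combinations
from statistics import mean

# Hypothetical gate order for each count vector: NB, 1C, 2C, 4C, 8C, 16C.


def to_fractions(counts):
    """Convert per-gate cell counts to fractions of all gated cells."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_distances(samples, sibling_pairs):
    """Return (mean sibling-pair distance, mean distance over all other pairs)."""
    profiles = {name: to_fractions(c) for name, c in samples.items()}
    sib_set = {frozenset(p) for p in sibling_pairs}
    sib = [euclidean(profiles[a], profiles[b]) for a, b in sibling_pairs]
    other = [
        euclidean(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
        if frozenset((a, b)) not in sib_set
    ]
    return mean(sib), mean(other)


# Toy counts (invented): mutA arrests mostly at 1C, mutB mostly at 2C.
toy_samples = {
    "mutA_sib1": [50, 900, 30, 10, 5, 5],
    "mutA_sib2": [60, 880, 40, 10, 5, 5],
    "mutB_sib1": [40, 100, 800, 40, 10, 10],
    "mutB_sib2": [30, 120, 790, 40, 10, 10],
}
toy_siblings = [("mutA_sib1", "mutA_sib2"), ("mutB_sib1", "mutB_sib2")]

sibling_mean, other_mean = mean_distances(toy_samples, toy_siblings)
print(sibling_mean < other_mean)
```

The same distance function could be applied to pairs of independent alleles of one gene to compare them with sibling pairs and with random pairs.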
Overall, the hierarchical clustering yielded eight definitive clusters, as shown in Figure 7D. The three most occupied groups are the 1C, 2C, and >2C groups (Table 2). Additional groups are the “Intermediates,” with a high frequency of cells with intermediate signal between the discrete DNA peaks; the 1/2/4C and 1/2C groups, which arrest with a mixed population in a few DNA peaks (1C, 2C, and 4C or 1C and 2C, respectively); and the “Fragiles” (Db), for which much of the sample is cell fragments. (Separately, we determined that the fragile mutant cells on an agar plate were very easily disrupted by gentle manipulation with a dissecting needle.)
Table 2. Hierarchical Clustering Based on DNA Content Yields Distinct Functional Categories within the Chlamydomonas Cell Cycle Mutants.
DNA Content | Typical Annotations | Gene (No.) | Orthologous Genes in Arabidopsis |
---|---|---|---|
1C | Growth related; DNA replication | 50 | 76% |
2C | APC subunits; DNA replication; CDKB | 13 | 84% |
4C/8C | Spindle formation | 25 | 80% |
Clustering all tested mutants according to their DNA content profile provides three major functional groups, each highly occupied. Listed in the table are the groups, including the number of genes in each one, and their functions as suggested by characteristic annotations. Mutants that had an intermediate phenotype were counted in the nearest category that fits their DNA profile (1/2C were either included in 1C or 2C; 1/2/4C were included in 4C/8C).
The 1C Cluster
The largest group by far accumulated almost exclusively cells with 1C (unreplicated) DNA at the blocked stage. Previously (Tulin and Cross, 2014), we noted that many G1-arresting mutants exhibit detectably slowed cell growth compared with the wild type, although the mutant cells fail to divide even after eventually achieving wild-type division size. This was true of many genes in the 1C cluster reported here as well. Thus, for some of these genes, the block in G1 could reflect a general monitoring of cell size/growth rate governing G1/S, so that slowing cell growth could cause G1 arrest in a relatively nonspecific manner. For other mutants, G1 arrest could reflect a specific involvement of the mutated gene in a novel regulatory pathway.
A mutation inactivating the cyclin-dependent kinase homolog CDKA1 was shown previously to strongly delay (but not block) cell cycle initiation (Tulin and Cross, 2014). cdka1-1 cells grow at normal rates during this elongated G1, reaching very large cell size (much larger than the 1C class mutants described above). Intriguingly, cdka1-1 is synthetic lethal at permissive temperature with a subset of category 1 genes (Atkins, 2018), as well as with a null mutation in the gene encoding the MED6 subunit of the Mediator complex (Tulin and Cross, 2014), suggesting a functional relationship between some general cell growth pathways and CDKA1 function.
The 1C class also contained mutants that could carry out cell cycle initiation, as reflected by the formation of incipient cytokinetic cleavage furrows (Tulin and Cross, 2014) (which occurs at the time of the first division in the wild type), but that specifically failed to initiate DNA replication. Eight genes annotated for a role in DNA replication fit this description (Table 2; e.g., the helicase gene RECQL4 and RIR1 [encoding the large subunit of ribonucleotide reductase]).
The sole Chlamydomonas cyclin B gene CYCB1 contributed to the 1C cluster, but in a complex manner. Different CYCB1 alleles were variably classified as 1C, 2C, or 4C (Supplemental Data Set 1). This probably reflects multiple roles of CYCB1 in promoting DNA replication and perhaps independently promoting mitosis, and perhaps also regulating DNA rereplication (Atkins and Cross, 2018). In yeast, the six B-type cyclins carry out all these roles, although with substantial division of labor (Bloom and Cross, 2007).
The 2C Cluster
The 2C cluster (most arrested cells with 2C DNA content) includes almost all mutations recovered in components of the anaphase-promoting complex (APC): mutations in APC2, 6, and 8 and CDC20, with multiple alleles in several cases (Figure 8). In a separate work, we found that when these mutants become arrested, they exhibit nearly uniform metaphase spindles (Atkins and Cross, 2018).
In yeast, almost any mutation that prevents the assembly of a functional mitotic spindle results in metaphase arrest with once-replicated DNA because the spindle assembly checkpoint inactivates APC-CDC20. As discussed in the next section, such checkpoint controls appear much less effective in Chlamydomonas, accounting for the high specificity within our screen of 2C arrest to APC mutations.
Some mutants in the 2C cluster are annotated for roles in DNA replication; we assume these are partially functional alleles that manage at least some replication before arrest. Additional genes in the 2C cluster lack annotations that are immediately informative as to their functional connection(s) to replication or mitosis; these are subjects for future work.
Interestingly, all APC mutations, but no other mutations in the 2C cluster, strongly blocked cleavage furrow formation. This was noted previously for one such mutation (cdc27-6; Atkins and Cross, 2018). It is currently unknown what APC target might regulate furrowing; CYCB1 was suggested as a candidate (Atkins and Cross, 2018).
>2C Mutants and Ploidy Control
Some mutants accumulated >2C DNA content when blocked at nonpermissive temperature. In most eukaryotic cells, DNA replication occurs in strict alternation with nuclear division, whether cells are mono- or multinucleate, maintaining control of ploidy at the level of individual nuclei. (An exception is found in endoreduplicating tissues, with multiple rounds of DNA replication without nuclear division.) Strict alternation means that DNA content per nucleus will range between 1C and 2C, independent of the number of nuclei in a cell; multiple rounds of replication without nuclear division will result in nuclear DNA content >2C. Since Chlamydomonas cells in the rapid division phase stay within the mother cell wall, a given cellular DNA content can correspond to one or many nuclei (e.g., a 4C cell could have four 1C nuclei, two 2C nuclei, or one 4C nucleus; only the last case provides evidence of loss of ploidy control). To resolve this ambiguity, we used the Amnis imaging flow cytometer, which captures fluorescent images of each individual Sytox-stained cell, allowing a count of nuclei in the cell while simultaneously quantifying total cellular DNA content (Figure 9; Supplemental Figures 5 and 7).
Figure 9.
Imaging Flow Cytometry Used to Determine a Ploidy Ratio.
(A) Diagram of the Chlamydomonas cell cycle.
(B) and (C) Wild-type (B) and spc25 (C) samples analyzed by image stream provide a similar pattern of discrete peaks, suggesting complete rounds of replication, as analyzed by conventional flow cytometry. The table provides the values for the average number of nuclei (AveNuc.) under each peak and the calculated DNA content per nucleus in each peak accordingly. Representative images for the wild type demonstrate the nuclei distribution and size within one cell before and after segregation (segregation was essentially unobserved in the spc25 mutant at any nuclear ploidy level).
(D) Distribution of DNA content per nucleus calculated for all mutants. Black arrow marks wild-type value.
(E) Mutants whose nuclei have average DNA content >2C were enriched for microtubule synthesis and spindle formation components (Supplemental Data Set 2; orange diamonds in [F]). Their DNA content per nucleus in each peak increases with each DNA replication round (colored lines), unlike the wild type, which maintains normal nuclear ploidy throughout the replication cycles (black lines).
(F) For sensitive detection of defective ploidy control, we examined 71 mutant samples for which at least ∼10% of cells were in the 4C peak. We examined images captured by Amnis for these 4C cells specifically and determined the proportion of mononucleates. These images provide one orientation and one plane of focus from a cell; a binucleate cell will necessarily get called as mononucleate at some frequency. We attribute the large signal at ∼0.2 to this background; therefore, a signal above 0.25 is likely specific to a defect in ploidy control. Some of the discussed genes are marked. Orange diamonds correspond to mutants in (E) with average DNA content >2C per nucleus.
For wild-type Chlamydomonas, we observed an average ratio of DNA/nucleus of <1.5 C, averaging over cells with total cellular DNA contents of 1C, 2C, 4C, and 8C (Figure 9). This indicates strict alternation of DNA replication and nuclear division, as shown previously by microspectrophotometry (Coleman and Maguire, 1982). We tested 177 mutants in this assay and observed ploidy ratios <2 for almost all mutants (Figure 9D). However, mutations in 15 genes resulted in severely defective ploidy control; for example, the spc25 mutant (Figure 9C) goes through repeated rounds of DNA replication with no nuclear division, resulting in a ploidy ratio of 4.6. Most of these mutants had lesions in genes for which annotations suggest a function in the formation of mitotic structures (spindle and condensed chromosomes; for example, SPC25 has homology to a known kinetochore component) (Supplemental Data Set 1). We have checked a subset of these mutants and found that they lack normal spindle structures at their blocks (data not shown), confirming previous results that suggested a weak spindle checkpoint (Tulin and Cross, 2014). Most mutants in the high-ploidy cluster form one or more apparent cleavage furrows, as judged by light microscopy, indicating independence of initial cytokinesis from spindle formation.
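The ploidy ratio can be sketched in Python as follows. DNA content per nucleus in each peak is the peak DNA content divided by the average number of nuclei scored in that peak, as in the Figure 9 legend. Weighting the peaks by cell number to obtain a single ratio is an assumption of this sketch, and all numbers are invented toy values rather than measurements.

```python
def dna_per_nucleus(peak_c, avg_nuclei):
    """DNA content per nucleus for one peak (peak DNA content / average nuclei per cell)."""
    return peak_c / avg_nuclei


def ploidy_ratio(peaks):
    """Cell-weighted mean DNA content per nucleus.

    peaks: list of (peak_c, avg_nuclei, n_cells) tuples.
    The cell-number weighting is an assumption of this sketch.
    """
    total_cells = sum(n for _, _, n in peaks)
    weighted = sum(dna_per_nucleus(c, a) * n for c, a, n in peaks)
    return weighted / total_cells


# Toy data (invented). With alternating replication and nuclear division,
# the number of nuclei keeps pace with total DNA content.
alternating = [(1, 1.0, 400), (2, 1.6, 300), (4, 3.2, 200), (8, 6.4, 100)]
# With replication but no nuclear division, every cell stays mononucleate.
no_division = [(1, 1.0, 250), (2, 1.0, 250), (4, 1.0, 250), (8, 1.0, 250)]

print(ploidy_ratio(alternating))  # stays between 1 and 2
print(ploidy_ratio(no_division))  # exceeds 2
```

In the toy no-division case the per-nucleus DNA content rises with each peak (1, 2, 4, 8), which is the pattern described for the high-ploidy mutants in Figure 9E.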
It is interesting that mutants with lesions in two subunits of condensin, SMC2 and SMC4, are in the high-ploidy class. This may reflect the involvement of the condensin complex in the construction or functioning of the spindle (Wignall et al., 2003).
Ploidy Control and Cyclin-CDK Regulation
Three other genes exhibited a suggestive high-ploidy signal, though less extreme than that observed with the spindle assembly mutants. These genes encode the plant-specific cyclin-dependent kinase CDKB, the CKS1 subunit of CDK-cyclin complexes, and the APC1 subunit of the anaphase-promoting complex (Supplemental Data Set 1). CDKB is the main mitosis-inducing CDK in Chlamydomonas (Tulin and Cross, 2014). The cdkb1-1 allele described previously arrests robustly with a single nucleus and once-replicated DNA (Tulin and Cross, 2014). The cdkb1-2 and cdkb1-3 mutants tested here show leakage through this block: Some cells replicate a second time to 4C, and these 4C cells were mostly mononucleate, indicating partially defective ploidy control.
Based on the imaging flow cytometry data, we examined all mutants with a significant 4C population and determined the proportion of those 4C cells that appeared mononucleate (as noted above, this is a hallmark of failure of ploidy control, since a 4C mononucleate cell must come from two rounds of replication without intervening nuclear division). This is a more sensitive quantitative assay for ploidy control than the overall ploidy ratio, since it focuses only on the subset of cells that get to 4C (and thus have the potential to exhibit ploidy control failure). In addition to the mutants already described, this sensitive assay detected sporadic failure in ploidy control in cells mutant for APC4 and APC6, since the rare 4C cells frequently contain only single nuclei (Figure 9F).
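A minimal Python sketch of this 4C-focused assay is given below. The 10% inclusion threshold and the 0.25 cutoff come from the Figure 9F legend; the function names and the per-cell nuclei counts are hypothetical.

```python
MIN_4C_FRACTION = 0.10       # sample is examined only if at least ~10% of cells are in the 4C peak
PLOIDY_DEFECT_CUTOFF = 0.25  # mononucleate fraction above this is likely a specific defect


def mononucleate_fraction(nuclei_counts_4c):
    """Fraction of 4C cells scored as having a single nucleus."""
    return sum(1 for n in nuclei_counts_4c if n == 1) / len(nuclei_counts_4c)


def assess_ploidy_control(total_cells, nuclei_counts_4c):
    """Return (fraction, flagged) or None if too few cells are in the 4C peak.

    nuclei_counts_4c: nuclei counted per imaged cell in the 4C gate.
    """
    if len(nuclei_counts_4c) / total_cells < MIN_4C_FRACTION:
        return None
    frac = mononucleate_fraction(nuclei_counts_4c)
    return frac, frac > PLOIDY_DEFECT_CUTOFF


# Toy samples (invented counts).
background_like = [1] * 30 + [2] * 70 + [4] * 50   # 20% scored mononucleate
defect_like = [1] * 90 + [2] * 30                  # 75% scored mononucleate

print(assess_ploidy_control(1000, background_like))
print(assess_ploidy_control(1000, defect_like))
```

The first toy sample sits at the background level attributed to binucleate cells imaged in an unfavorable orientation, so it is not flagged; the second is.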
As noted above, we isolated multiple alleles in the sole Chlamydomonas cyclin B. Several CYCB1 alleles clustered with the 1C-arresting mutants, suggesting a role for cyclin B in replication initiation (Supplemental Data Set 1; Atkins and Cross, 2018). However, other alleles accumulate higher DNA content (2C and 4C); these alleles also showed a low level of defective ploidy control (cycb1-1 and cycb1-8; Figure 9F).
Thus, in addition to mutations affecting spindle morphogenesis, ploidy control defects were highly specific to cyclin B, CDKB, and the APC. The cks1-1 mutant also had a partial ploidy control defect; CKS1 is an essential subunit of CDK complexes in yeast and animals, and it was recently shown to play a direct role in activation of the APC by cyclin B/CDK1 (Fujimitsu et al., 2016; Qiao et al., 2016; Zhang et al., 2016). In yeast and animals, replication origin loading with the MCM helicase complex is a key step in controlling DNA replication. CDK complexes both promote replication initiation from loaded replication origins and block reloading of replication origins after initiation (Bell and Dutta, 2002). This system may couple a single round of DNA replication to a single cycle of CDK activity. In animals, the APC also directly regulates origin reloading by promoting proteolysis of the loading inhibitor geminin (McGarry and Kirschner, 1998). It is unknown how replication origin reloading is controlled in the plant kingdom; our results suggest a complex role for CYCB1/CDKB1 (perhaps in complex with CKS1) and the APC in regulating replication and ploidy control.
Insights from Annotations: “Mystery Genes”
Most of the genes in the mutant collection are reasonably expected to directly affect cell cycle regulation based on likely function, as suggested by alignment and annotations, compared with results for similar or orthologous genes from fungi and animals (e.g., ORC and DNA polymerase subunits, cyclin B, γ-tubulin ring complex subunits). However, this is not the case for a minority of detected genes (Supplemental Data Set 4 lists 47 such genes, combining results from this work and from Tulin and Cross [2014]). Thirty-eight of these 47 genes are in the G1 cluster. Nine of these have little detectable growth deficit, but still arrest quantitatively with 1C DNA content (examples: Cre09.g400478 [phosphatidylinositol 4P-5 kinase PIP5K9 ortholog] and Cre06.g278197 [desumoylating protease SENP7 ortholog]).
The mutants we analyzed were collected based on Ts lethality, then screened for near-wild-type cell growth proficiency combined with failure to complete division. These simple criteria were chosen deliberately to avoid being too restrictive. However, as a consequence, some of the mutated genes may be needed for essential processes not directly related to the cell cycle, but pass the criteria, for example, due to phenotypic lag allowing significant cell growth before arrest. Further work will be required to evaluate how directly the listed genes are involved in cell cycle regulation.
In any case, it is surprising that based on annotations, a large majority of genes identified by these simple criteria have been clearly implicated in cell cycle regulation or execution based on prior work in fungi, animals, and/or land plants. This global finding implies that the gene set for core cell cycle regulation is largely unchanged since the time of the last common ancestor of animals and plants, very early in eukaryotic evolution (Rogozin et al., 2009). Because our procedure for identification of causative mutations is annotation independent, plant-kingdom-specific cell cycle regulators should have been detected with equal efficiency to regulators conserved across eukaryotes. Plant-kingdom-specific regulators do exist (e.g., the serine/threonine-protein phosphatase BSL1 and CDKB1; Tulin and Cross, 2014; Atkins and Cross, 2018) but represent only a small proportion of the total.
It is important to note that sequence homology, or even orthology, does not allow findings from yeast and animals to be directly imported to the plant kingdom: Orthology is neither necessary nor sufficient for equivalence of function (Cross et al., 2011). The conditional mutants obtained here will be important tools to evaluate this. For example, while CDK1 is the proximal activator of mitosis in fungi and animals, its ortholog CDKA is dispensable for mitosis in Chlamydomonas and in land plants, instead functioning primarily in early cell cycle commitment; mitosis is instead promoted by the plant-kingdom-specific CDKB (Nowack et al., 2012; Tulin and Cross, 2014; Atkins and Cross, 2018).
Conservation of the Identified Genes in Land Plant Genomes
Chlamydomonas has the potential to serve as a powerful microbial model for multicellular land plant cell biology. For this to be an effective approach with the temperature-sensitive lethal mutations found here, it is important to have an understanding of the evolutionary conservation of the genes and their functions. While functional conservation is hard to address without detailed characterization, sequence conservation can be quantified. The Arabidopsis genome is the best understood among land plants, so we concentrated on this comparison.
Best reciprocal BLAST analysis (Remm et al., 2001) identifies candidate orthologs, and the DOE Phytozome website has a genome-wide ortholog database generated with this method for Arabidopsis versus Chlamydomonas (https://phytozome.jgi.doe.gov/pz/portal.html). Only 21% of Chlamydomonas proteins are reported as having Arabidopsis orthologs; however, ∼75% of the genes identified here by Ts-lethal mutations encoded proteins in this subset. This striking enrichment confirms previous findings on a smaller sample of genes (Tulin and Cross, 2014; Cross et al., 2017) and strongly supports the idea that the essential gene set in Chlamydomonas provides a good model for cell-essential genes throughout the plant kingdom.
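To give a sense of scale for this enrichment, the following Python sketch computes the fold enrichment from the two percentages quoted above and a one-sided binomial tail probability. The gene counts passed to the binomial function (112 of 150) are illustrative values chosen to approximate 75% and are not figures reported in the text; the binomial model itself is an assumption and is not necessarily the test used for Figure 6E.

```python
from math import comb


def fold_enrichment(observed_fraction, background_fraction):
    """Ratio of the observed ortholog fraction to the genome-wide fraction."""
    return observed_fraction / background_fraction


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Percentages from the text: ~75% of identified genes vs 21% genome-wide.
print(fold_enrichment(0.75, 0.21))

# Illustrative counts only (assumption): 112 of 150 genes with an ortholog.
print(binomial_upper_tail(112, 150, 0.21) < 1e-5)
```

Under these assumptions the ortholog fraction is roughly 3.6-fold above the genome-wide value, and such a fraction would be extremely unlikely to arise by chance.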
The frequent existence of paralogous sequences (gene duplicates) in land plant genomes but not in Chlamydomonas was noted in the Introduction. Consistently, almost half of the genes that we identified based on Ts-lethal mutations align by best reciprocal BLAST analysis to multigene families with high sequence conservation within the Arabidopsis genome (Supplemental Data Set 1). Genetic redundancy (very common with paralogous sequences) would then imply that in these cases, even complete null alleles of individual family members in Arabidopsis might have very mild or no phenotype (indeed, this is a very common observation in Arabidopsis, where combinations of multiple paralog knockouts are frequently needed for strong phenotypes suggesting function). Thus, approximately half of the cell-essential gene set overall might be extremely difficult to identify genetically in Arabidopsis solely for reasons of genetic redundancy, whereas this problem is almost nonexistent in Chlamydomonas for the orthologous gene set.
DISCUSSION
In this work, we screened many thousands of Ts-lethal mutations, selected ∼350 for detailed characterization, and applied the MAPS-SEQ method to identify 260 genomic regions and 150 candidate essential genes, many identified in multiple alleles. We also identified likely null alleles (premature stop codons or strong splicing mutations) in ∼700 nonessential genes. In previous work with point mutants, identification of the causative mutation/gene has been a serious bottleneck. MAPS-SEQ works with accuracy comparable to that of one-at-a-time bulked segregant sequence analysis, the most efficient method from previous work (Tulin and Cross, 2014), but with (at least) dozens of mutants tested in a single experiment. The reason for this efficiency is that in bulked segregant sequence analysis, the reads at almost every position are all wild type; only rarely does a position show some representation of a mutant sequence. Most positions are therefore simply wild type, sequenced over and over for mutant after mutant, which is wasteful. In MAPS-SEQ, all mutants in the pool contribute representations of their mutations, so many more positions are diverse and informative. In three other respects, MAPS-SEQ is better than bulked segregant analysis. First, mating and tetrad analysis are required for each individual mutant with bulked segregant analysis, while with MAPS-SEQ, only one mating reaction and no tetrad analysis are required. Second, with 10 Ts− segregants pooled (the number we used previously), there is considerable variability in representation of any mutant region due to statistical variation in the genetic composition of the 10 pool members. With MAPS-SEQ, the total number of meiotic products contributing to the pool is many hundreds, allowing <1 cM resolution at sufficient sequence coverage; in addition, undepleted (neutral) mutations are present at a near-constant proportion (Figure 4).
Finally, MAPS-SEQ only moderately increases needed sequence coverage, but drastically reduces labor-intensive upstream genetic manipulations and sample preparation (Supplemental Table 1). This means that the focus can be affordably placed on a broad mutant collection without highly specific criteria for inclusion (an advantage, since such criteria inevitably incorporate biases and incorrect ideas about how the “good” mutants should behave phenotypically).
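The depletion logic can be illustrated with a simplified Python sketch. It assumes that each mutation is summarized by the number of reads carrying the mutant allele and the total reads at that position in the Ts+ pool, that neutral mutations sit near a common baseline proportion, and that a mutation is called depleted when its proportion falls below an arbitrary fraction of the median. The cutoff, names, and read counts are hypothetical, and the actual MAPS-SEQ analysis is not reproduced here.

```python
from statistics import median


def mutant_fractions(read_counts):
    """Map mutation name -> fraction of reads carrying the mutant allele.

    read_counts: dict of name -> (mutant_reads, total_reads).
    Positions with no coverage are skipped.
    """
    return {
        name: mut / total
        for name, (mut, total) in read_counts.items()
        if total > 0
    }


def depleted_mutations(read_counts, relative_cutoff=0.2):
    """Names of mutations whose fraction is below relative_cutoff x median fraction.

    relative_cutoff is an arbitrary illustrative threshold (assumption).
    """
    fractions = mutant_fractions(read_counts)
    baseline = median(fractions.values())
    return sorted(n for n, f in fractions.items() if f < relative_cutoff * baseline)


# Toy read counts (invented): passengers near 1%, one mutation absent from the Ts+ pool.
toy_counts = {
    "passenger_1": (10, 1000),
    "passenger_2": (12, 1000),
    "passenger_3": (9, 1000),
    "passenger_4": (11, 1000),
    "candidate_driver": (0, 1000),
}

print(depleted_mutations(toy_counts))
```

Using the median as the baseline keeps the estimate robust to the small number of depleted mutations, since the large majority of mutations are expected to be neutral passengers.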
It is of interest to ask how complete the current collection is with respect to the cell cycle. Zones et al. (2015) list 68 Chlamydomonas genes with annotations for DNA polymerases, the ORC and MCM complexes, the condensin and cohesin complexes, and cyclin-CDKs. We have Ts-lethal mutations in 32 of these genes (47%), distributed across these categories. We assume that this implies a similar ∼50% hit rate for all genes involved in cell cycle regulation, including for genes where annotations so far do not suggest such a connection. Since most biological pathways depend on multiple genes and proteins, availability of Ts-lethal mutations in 50% of cell cycle genes means that almost all such complexes can be inactivated conditionally by one or more independent mutations in the present collection. This should be a valuable resource for future work: The cell cycle impinges on nearly all aspects of cell biology, directly or indirectly, and this collection allows efficient determination of how this happens. Any assayable event can be monitored across a spectrum of Ts-lethal backgrounds blocking the cell cycle at diverse points, to obtain an informative first look at dependency relationships between cell cycle regulation and other events of interest (e.g., regulated transcription, photosynthesis, basal body/centriole/flagellar function, protein localization, and chloroplast division).
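The step from a ∼50% per-gene hit rate to coverage of almost all multigene complexes can be made explicit with a short calculation. The Python sketch below assumes that each gene in a complex is hit independently with the same probability, which is a simplification; the complex sizes shown are arbitrary examples.

```python
def hit_rate(genes_with_mutations, genes_listed):
    """Fraction of listed genes represented by at least one Ts-lethal mutation."""
    return genes_with_mutations / genes_listed


def at_least_one_hit(p, n_genes):
    """Probability that a complex of n_genes has at least one mutated gene,
    assuming independent hits with per-gene probability p (simplifying assumption)."""
    return 1 - (1 - p) ** n_genes


# From the text: 32 of the 68 listed genes carry Ts-lethal mutations.
print(round(100 * hit_rate(32, 68)))

# Coverage of complexes of arbitrary example sizes at a 50% per-gene hit rate.
for n in (1, 2, 4, 6):
    print(n, at_least_one_hit(0.5, n))
```

Under this assumption, a complex with four essential subunits would be represented in the collection with probability 1 − 0.5^4, or about 94%, and larger complexes with even higher probability.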
The MAPS-SEQ method is directly applicable to any haploid system with Mendelian genetics (Chlamydomonas, yeast, and fungi). It also should work with diploid multicellular organisms with modifications (conceptual mating and screening schemes are presented in Supplemental Figure 8). It works with any scorable trait (it is not restricted to essential genes) and will work with mutations in different pathways and different mutant phenotypes, provided these can be independently scored.
Phylogenetic analysis suggests plants diverged very early from nonphotosynthetic eukaryotes (Rogozin et al., 2009). Since the consensus models for most cell biology stem from yeast and animal studies, forward, unbiased screens are important to determine where plant kingdom-specific divergences might be found. A major limitation to most mutant hunts has been the expensive and labor-intensive sequence identification of the causative mutation. This led to the need for careful phenotypic selection of specific mutants for sequencing, meaning that most mutants remained unidentified; if phenotypic selection was biased in some way, then entire classes of mutants might be missed. The methods reported here can alter the balance, allowing the production of a large and unbiased collection of sequence-identified conditional mutants inactivating essential genes. Such a collection synergizes with recently developed systematic insertional disruption libraries knocking out nonessential genes (Zhang et al., 2014; Li et al., 2016) and with the developing ability for targeted genome editing in Chlamydomonas (Baek et al., 2016; Greiner et al., 2017), strongly enhancing the utility of Chlamydomonas as a powerful genetic model for plant kingdom cell biology.
METHODS
Strains, Mutant Isolation, and Genetic Characterization
Chlamydomonas reinhardtii congenic strains CC-4402 (mt+) and CC-4403 (mt−) were provided by Susan Dutcher (Department of Genetics, Washington University School of Medicine). Medium was prepared as described (Dutcher, 1995; Harris, 2008). Mutant isolation and complementation, linkage, and reversion analyses were performed as described (Breker et al., 2016).
Combinatorial Pooling
Individual mutants were grown as patches on agar TAP plates. Patches were scraped off, weighed, and suspended in liquid TAP (2:1 [w/w] TAP:pellet). Next, equal volumes from each mutant were pooled according to their unique occupancy pattern within the set of pools, resulting in ∼1 g of cell pellet for each pool.
Preparation of a Multiplexed Bulked-Segregant Pool
Fresh cultures of each mutant were suspended in nitrogen-free gamete-induction medium (Dutcher, 1995). Equal amounts of each mutant were pooled according to mating type. Following 5 h of gametogenesis in the light (using LED bulbs for illumination at an average light intensity of ∼4000 Lux), cells were mated for ∼1 h in liquid, plated on TAP plates, and incubated for 12 to 16 h in the light and 5 d in the dark for zygospore maturation. The plates were scraped with a sharp razor to remove most unmated cells, and meiosis was induced by 2 to 3 d of incubation in the light. Once haploid progeny began to proliferate, the cultures received three rounds of selection at restrictive temperature (2 d each), with replating between rounds. (Because the resulting Ts+ pool contains both mating types, it is important to replate before the plates become overgrown, to avoid mating within the pool when nitrogen is depleted.) Genomic DNA was extracted from the final Ts+ pool.
Genomic Sample Preparation
Cell pellets were suspended in 8 mL TEN (10 mM Tris, 10 mM EDTA, and 150 mM NaCl, pH 8) with 0.5% SDS and 0.1 mg/mL proteinase K and heated to 50°C for 1 h. Samples were sonicated and incubated at 50°C for an additional 1 h. Lysate was extracted twice with 16 mL phenol-chloroform and was precipitated with 16 mL 100% ethanol. The precipitate was rinsed with 70% ethanol and resuspended in TE (10 mM Tris and 1 mM EDTA, pH 8). NaCl was added to 1 M, and the solution was phenol-chloroform-extracted and ethanol-precipitated. The pellet was washed in 70% ethanol and resuspended in 200 μL TE. Samples were treated with RNase A, and purified DNA was fragmented by sonication (Covaris S2 Focused Ultrasonicator) to 300 bp, followed by gel excision of a size range from 200 to 500 bp. Following gel extraction, DNA was end-repaired and dA-tailed. Illumina TruSeq sequencing adapters were ligated to dA-tailed DNA. All enzymatic reactions were performed using New England Biolabs kits. Adapter-ligated DNA (350 to 600 bp) was purified using carboxylic acid-coated Dynabeads MyOne (Invitrogen). DNA concentration was estimated both by agarose gel electrophoresis and by conventional qPCR using seven dilutions (500×–32,000× in 2× dilution steps) of a standard sample. Efficiency of end modification was tested as follows: (1) In a small test ligation, sonicated DNA self-ligates very inefficiently (most ends are staggered). End repair results in fragments that self-ligate very efficiently, to concatemers of more than ∼5 fragments, as assayed by gel electrophoresis. (2) Conversely, successful A-tailing eliminates self-ligatability but allows ligation of a T-tailed Y-adapter to the fragments, causing a migration shift of ∼100 bp. Importantly, library preparation was PCR free. Samples with different indices were pooled with approximately equal representation. We estimate >80% conversion of genomic fragments to doubly end-ligated library fragments.
These procedures result in significantly more even representation across the genome, although read count is still overdispersed compared with counting error expectations, due to reproducible (sequence-determined) variations in the number of reads detected across the genome (data not shown). Illumina sequencing was performed by Beijing Genomics Institute, New York Genome Center, or GENEWIZ.
Sequencing and Bioinformatics
Overlapping Pool Sequencing
Placing each mutant in two of n pools allows a unique pool pattern for n(n−1)/2 mutants. While other patterns of pool occupancy allow more mutants per set of pools (e.g., using all possible nonempty occupancy patterns allows 2^n − 1 mutants), the two-pools-per-mutant scheme has more favorable error-correcting characteristics (Supplemental Figure 2). For example, in the full pattern procedure, a mutation detected in pools A, B, and C could be genuinely AB with a false positive in C, ABCD with a false negative in D, etc. With two pools per mutant, false positive or false negative instances always result in a detectable error, with a minimal number of possible corrections.
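The two-pools-per-mutant layout and its error-detecting property can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code; the function names and the integer labeling of pools and mutants are assumptions made for the example.

```python
from itertools import combinations


def two_pool_design(n_pools):
    """Map mutant index -> the unique pair of pools that mutant occupies."""
    return {
        mutant: frozenset(pair)
        for mutant, pair in enumerate(combinations(range(n_pools), 2))
    }


def decode(observed_pools, design):
    """Return the mutants compatible with the pools in which a mutation was seen.

    Two pools: direct lookup. One or three pools: a detectable error
    (one false negative or one false positive); every mutant whose pair
    differs from the observation by a single pool is returned.
    """
    observed = frozenset(observed_pools)
    if len(observed) == 2:
        return [m for m, pair in design.items() if pair == observed]
    if len(observed) in (1, 3):
        return [m for m, pair in design.items() if len(pair ^ observed) == 1]
    return []  # any other pattern is treated as unassignable in this sketch


if __name__ == "__main__":
    design = two_pool_design(8)
    print(len(design), "mutants fit in 8 pools")
    print("seen in pools 0, 1, 2 ->", decode({0, 1, 2}, design))
```

In this sketch, a mutation seen in three pools is never silently misassigned: it is flagged as an error with exactly three candidate corrections, one for each pair among the three pools.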
We aimed for ∼90 to 100× coverage of each pool and 14 to 16× coverage per mutant. We tried three levels of pooling: 28 and 36 mutants in sets of eight or nine pools, respectively (each mutant appears in two pools, and accordingly each pool is composed of 7 or 8 mutants, respectively), and 84 mutants in a set of nine pools, where each mutant appears in three pools and accordingly each pool is composed of 28 mutants (Figure 2A). The rates of mutation assignment and of successful identification of driver lesions were similar in all setups. Assignment of mutations to individual mutants is based on detection of a mutation in exactly the two pools to which the mutant contributed (Figures 1B and 2A; Supplemental Figure 2). In one experiment, we detected ∼5700 genetic lesions in 28 sequenced specimens and uniquely assigned an average of ∼200 to each specimen (Figure 2C). The number of lesions assigned per mutant varied from 87 to 386, due in part to diverse UV dosages during mutagenesis (Breker et al., 2016).
Statistical analysis strongly supported the reliability of the assignments of mutations to strains. Eighty-five percent of the mutations were detected in just two pools, as expected, and only 1.4% were detected in a single pool. A further 13.6% of mutations were detected in three pools; however, in essentially every case, two of these pools had >90% of the total mutation read count, and the third pool typically had only a single read, deriving most likely from Illumina read error, barcode misassignment, or low-level cross-contamination of pooled cell or DNA samples. After discarding such stray single reads, the net result was assignment of >95% of mutations to a single specific mutant based on specific double pool occupancy.
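The pool compositions quoted above follow from simple counting: if every possible pattern of k pools out of n is used once, the number of mutants is the binomial coefficient C(n, k), and each pool receives C(n, k)·k/n of them. A short check (the function name is hypothetical):

```python
from math import comb


def pool_layout(n_pools, pools_per_mutant):
    """Return (number of mutants, mutants per pool) when every pattern is used once."""
    capacity = comb(n_pools, pools_per_mutant)
    mutants_per_pool = capacity * pools_per_mutant // n_pools
    return capacity, mutants_per_pool


if __name__ == "__main__":
    for n, k in ((8, 2), (9, 2), (9, 3)):
        mutants, per_pool = pool_layout(n, k)
        print(f"{n} pools, {k} pools per mutant: {mutants} mutants, {per_pool} per pool")
```

The three setups described in the text correspond to (8, 2), (9, 2), and (9, 3).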
In sequence data from a pool containing a mutant with a lesion at some position, the number of mutant reads should average the total read count at the position divided by the number of mutants contributing to the pool, distributed with Poisson (counting) variability. We found a good fit to this expectation (Supplemental Figure 2).
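The expectation stated above can be written out directly. The sketch below assumes equal representation of every mutant in a pool and pure Poisson counting noise; the function names and the example numbers (100 reads, 7 mutants) are illustrative assumptions, not values taken from a specific data set.

```python
from math import exp, factorial


def expected_mutant_reads(total_reads, mutants_in_pool):
    """Mean mutant read count at a position carrying one mutant's lesion."""
    return total_reads / mutants_in_pool


def poisson_pmf(k, lam):
    """Probability of observing exactly k reads when the mean is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def prob_fewer_than(k_min, lam):
    """Probability of seeing fewer than k_min mutant reads (a likely missed call)."""
    return sum(poisson_pmf(k, lam) for k in range(k_min))


if __name__ == "__main__":
    lam = expected_mutant_reads(total_reads=100, mutants_in_pool=7)
    print(f"expected mutant reads: {lam:.1f}")
    print(f"P(fewer than 3 mutant reads): {prob_fewer_than(3, lam):.2e}")
```

Under these assumptions, a position with ∼100× pool coverage and seven mutants per pool is expected to yield ∼14 mutant reads, so complete dropout by counting noise alone should be very rare.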
Overall, these statistical features indicate that mutations were reliably detected in just two pools at the expected frequencies, indicating high-quality assignments. Systematic, reproducible variation in total read count at individual positions, mainly due to variability in recovery of alignable reads, prevented effective detection over at most 10% of the genome (Supplemental Figure 2G).
Bioinformatics
After paired-end Illumina sequencing, reads were aligned to the reference genome using bowtie2 (Langmead et al., 2009) and processed with mpileup (Li et al., 2009) to determine reference and variant calls at each position. mpileup was run with the base alignment quality (BAQ) adjustment disabled (option -B, --no-BAQ), in order to recover raw Illumina quality scores for each call in the output. mpileup output for each chromosome was processed by MATLAB code to a matrix form: six rows (A, C, G, T, deletion, and insertion) and one column for each chromosomal position (code is available upon request). Read count and average quality score for each call at each position were recovered in matrix form. Mutation detection was always performed by comparing such matrices derived from parental sequence data with those derived from mutant pool data. Thus, common errors (from sequencing, alignment, etc.) found in both “subjects” (parents) and “queries” (mutant pools) were disregarded. This matrix form makes detection, quantification, and initial evaluation of mutant reads very rapid using simple MATLAB matrix operations.
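The parent-versus-pool matrix comparison can be sketched in a few lines. This is an illustrative Python rendering, not the authors' MATLAB code; the function name, the toy counts, and the thresholds `min_pool_reads` and `max_parent_reads` are assumptions made for the example.

```python
CALLS = ("A", "C", "G", "T", "del", "ins")  # the six matrix rows described in the text


def candidate_positions(parent, pool, ref_rows, min_pool_reads=2, max_parent_reads=0):
    """List (position, call, pool read count) for non-reference calls seen in
    the pool but not in the parent.

    parent, pool: six rows of read counts, one column per position.
    ref_rows: for each position, the row index of the reference call.
    """
    hits = []
    for pos, ref_row in enumerate(ref_rows):
        for row, call in enumerate(CALLS):
            if row == ref_row:
                continue  # reference calls are not candidates
            if pool[row][pos] >= min_pool_reads and parent[row][pos] <= max_parent_reads:
                hits.append((pos, call, pool[row][pos]))
    return hits


if __name__ == "__main__":
    ref_rows = [0, 1, 2]  # reference is A, C, G at positions 0, 1, 2
    parent = [[30, 0, 0], [0, 28, 0], [0, 3, 31], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
    pool = [[90, 0, 0], [0, 85, 0], [0, 4, 80], [12, 0, 0], [0, 0, 0], [0, 0, 0]]
    print(candidate_positions(parent, pool, ref_rows))
```

In the toy data, the G calls at position 1 appear in both parent and pool and are therefore ignored as a shared error, whereas the T calls at position 0 appear only in the pool and are reported as a candidate.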
This output provided an initial, highly error-prone list of candidate mutations. Errors must be carefully screened out because in these procedures even rare (low minority) calls must be considered. A low Illumina quality score for the called position is a major error predictor, as expected. Another predictive feature is consistency of calls at a position, measured by coefficient of variation of read count at a position, across the six possible calls (low CV correlates with error); another is total read count at a position (positions with low overall read count are prone to errors). We also found a number of indicators not accessible in pileup format: Features of the read in which a candidate mutation call is embedded. These features include mapping quality, average Illumina quality score over the entire read, number of mismatches to reference (excluding the mutant call itself and a few positions to either side), and the proportion of repetitive sequence (indicated as “soft-masked repeat” in the reference genome from Phytozome). These error indicators were only partially correlated (data not shown); therefore, screening out individual reads with conservative thresholds for each indicator vastly decreases errors while removing only a small proportion of total reads. Determining these features requires computational reexamination of the bowtie2 .bam aligner output. For efficiency, we did this reexamination only on candidate positions detected using the matrix format derived from pileup output, screened for number and proportion of mutant calls and CV for subjects and queries.
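Two of the error indicators described above can be sketched as follows: the coefficient of variation (CV) of read counts across the six calls at a position, and a per-read filter that applies a threshold to each read-level indicator. The threshold values, dictionary keys, and function names are hypothetical; the text does not report the values actually used, and the population standard deviation is one possible choice for the CV.

```python
from statistics import mean, pstdev

# HYPOTHETICAL thresholds, for illustration only (no values are given in the text).
THRESHOLDS = {
    "min_mapq": 20,
    "min_mean_base_quality": 30.0,
    "max_other_mismatches": 2,
    "max_repeat_fraction": 0.5,
}


def call_cv(counts):
    """CV of read counts across the six calls at one position.

    A clean position (nearly all reads on one call) gives a high CV;
    reads scattered across several calls give a low CV.
    """
    avg = mean(counts)
    return pstdev(counts) / avg if avg > 0 else 0.0


def read_passes(read, thresholds=THRESHOLDS):
    """Apply each read-level indicator independently; all must pass."""
    return (
        read["mapq"] >= thresholds["min_mapq"]
        and read["mean_base_quality"] >= thresholds["min_mean_base_quality"]
        and read["other_mismatches"] <= thresholds["max_other_mismatches"]
        and read["repeat_fraction"] <= thresholds["max_repeat_fraction"]
    )


if __name__ == "__main__":
    print(f"clean position CV: {call_cv([100, 0, 0, 0, 0, 0]):.2f}")
    print(f"noisy position CV: {call_cv([40, 30, 20, 10, 0, 0]):.2f}")
```

Because the indicators are only partially correlated, requiring every one of them to pass removes reads that fail on any single criterion, which is the behavior sketched in `read_passes`.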
Assignment of mutation candidates to specific mutants further requires that the call be made either in exactly the two pools the mutant was entered into, or at most one more pool, and that at least 90% of the reads be found in only the two relevant pools. These criteria screened out only a small number of candidates, and they were obvious false positives (a typical example would be a position with one to three mutant reads each in four or five different libraries).
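The assignment rule above (two pools, or at most one extra pool, with at least 90% of mutant reads in the two relevant pools) can be expressed as a small function. This is a sketch under the stated rule; the function name, the string pool labels, and the choice of the two highest-count pools as the "relevant" pair are assumptions made for the example.

```python
def assign_pool_pair(mutant_reads, min_fraction=0.90):
    """Return the pair of pools a candidate is assigned to, or None.

    mutant_reads: dict mapping pool label -> mutant read count at the position.
    """
    positive = {pool: n for pool, n in mutant_reads.items() if n > 0}
    if len(positive) not in (2, 3):
        return None  # must be seen in exactly two pools, or at most one more
    top_two = sorted(positive, key=positive.get, reverse=True)[:2]
    fraction = sum(positive[p] for p in top_two) / sum(positive.values())
    if fraction < min_fraction:
        return None
    return frozenset(top_two)


if __name__ == "__main__":
    print(assign_pool_pair({"A": 14, "B": 12, "C": 1, "D": 0}))  # stray read in C
    print(assign_pool_pair({"A": 2, "B": 1, "C": 3, "D": 2}))    # scattered reads
```

The returned pair would then be looked up in the pooling design to name the mutant; a pattern such as one to three reads in four or five libraries returns None, matching the obvious false positives described in the text.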
Candidates passing these criteria are almost all correct calls (that is, they represent genuine mutations present specifically in the assigned mutant), based on criteria discussed above; therefore, false positives are rare. Since we recovered the true causative mutation in most mutants, false negatives (true mutations excluded by the criteria) were also relatively rare.
Determination of the Causative Mutations
After construction and sequence analysis of the Ts+ pool, the final output of the pipeline is a MATLAB data structure containing (for each mutant) the occupancy pattern, assigned mutations, and frequency of these mutations in the Ts+ pool sequencing. Using this as an input for a plotting program PlotMutSnps.m, the assigned mutations for each mutant were plotted as shown in Figure 4: x axis is the position on each chromosome and y axis is the fraction of mutation reads in the relevant Ts+ pool.
For each chromosome, least-squares errors were calculated for the hypothesis of no depletion, and for the hypotheses of depletion at some position on the chromosome, along with the expectation of single crossovers at random positions proximal to the site of depletion. These lines were graphed in red or green, respectively, depending on which had a better fit. This aided in the visual detection of clearly depleted genomic regions, allowing an initial classification of mutants based on number of depleted regions (0, synthetic lethal; 1, single gene; 2 or more, multiple depletion; all depleted, poor maters). Additionally, each chromosome was plotted separately, with the frequencies of mutations in each assayed Ts+ pool shown side by side, to verify correct assignments. Mutations are represented as red circles (non-CDS-changing mutations) and blue circles (CDS-changing mutations). Identified depleted mutations were checked specifically by clicking on the relevant circle and obtaining a complete report on the position, reads, gene, and mutation effect. All CDS-changing mutations depleted in an identified depleted region were considered to be candidate causative mutations. For a minority of mutants (∼15–25%), we detected a clear V pattern, with no CDS-changing mutation at its base. This can be explained in two ways. The first is failure to recover any aligned reads over the critical region, either due to failure of Illumina sequencing to produce reliable reads due to sequence characteristics, or failure of alignment (for example, if the mutation occurs in a repetitive [unmappable] sequence). Another possibility, for cases where a depleted region was identified and only non-CDS-changing mutations were detected, is that some of these changes actually interfere with gene function. For example, some apparently non-CDS-changing mutations might in fact alter coding sequences that are not recognized in the annotation.
This was observed previously: a strong loss-of-function mutation mapped to a probable premature stop codon mutation that according to annotation was in the 5′-untranslated region; most likely this region was actually part of the CDS (Cross, 2016). In this work, the CYCB1 mutations were initially missed, as the mutations are in a region annotated as the 3′-untranslated region of the gene adjoining cyclin B1 in tail-to-tail orientation. CYCB1 was annotated as containing only an N-terminal fragment (https://phytozome.jgi.doe.gov/pz/portal.html); however, sequence alignment showed that this proposed 3′-untranslated region contained sequence highly homologous to the C terminus of cyclin B, and transcription data supported inclusion of these sequences in the cyclin B1 coding sequence (https://phytozome.jgi.doe.gov/pz/portal.html; Atkins and Cross, 2018). Another possibility is that some apparently silent mutations could change codon usage to a very rare/absent tRNA (Sørensen et al., 1989; Naya et al., 2001; Cannarozzi et al., 2010; Gingold and Pilpel, 2011; Salinas et al., 2012). A final potential cause of “unsequenceability” is a gene whose sequence is missing in part or in whole from the reference. This is difficult to evaluate at present; it is clear that there is at least some level of missing coding sequence from the current assembly (Tulin and Cross, 2016).
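The least-squares comparison described at the start of this section (no depletion versus depletion at some position on the chromosome) can be sketched as below. This is a deliberately simplified stand-in, not the model in PlotMutSnps.m: it assumes a linear V shape that reaches zero at the candidate site and rises in proportion to physical distance, and it restricts candidate sites to the positions of assigned mutations. All function names are hypothetical.

```python
def sse_flat(freqs):
    """Squared error of the 'no depletion' model: one constant frequency."""
    avg = sum(freqs) / len(freqs)
    return sum((f - avg) ** 2 for f in freqs)


def sse_v(positions, freqs, site):
    """Squared error of a V model, f = slope * |x - site|, with the slope
    fitted by least squares (ASSUMED linear rise with distance)."""
    dist = [abs(x - site) for x in positions]
    denom = sum(d * d for d in dist)
    slope = sum(f * d for f, d in zip(freqs, dist)) / denom if denom else 0.0
    return sum((f - slope * d) ** 2 for f, d in zip(freqs, dist))


def best_model(positions, freqs):
    """Return ('depleted', site) if some V fits better than the flat model,
    else ('flat', None). Candidate sites are the mutation positions."""
    flat_err = sse_flat(freqs)
    v_err, site = min((sse_v(positions, freqs, x), x) for x in positions)
    return ("depleted", site) if v_err < flat_err else ("flat", None)


if __name__ == "__main__":
    positions = [1_000_000 * i for i in range(1, 10)]
    v_shaped = [abs(x - 5_000_000) * 1e-8 for x in positions]
    print(best_model(positions, v_shaped))
    print(best_model(positions, [0.03] * len(positions)))
```

A chromosome classified as "depleted" in this sketch corresponds to one drawn with the depletion fit in the plots; the number of such chromosomes per mutant gives the initial classification (none, one, several, or all).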
BLAST-Blosum Scoring
We used BLAST-Blosum scoring as described previously (Cross et al., 2017). Briefly, mutations are categorized into four classes by BLAST analysis. Class A: Mutation falls within a segment of BLAST alignment (high-scoring pair [HSP]), and the mutation alters a conserved residue within this segment. Class B: Mutation falls within an overall conserved region but alters an unconserved residue, is BLAST-aligned across a small deletion in the Arabidopsis thaliana sequence in the HSP, or is found between two distinct HSPs. Class C: Mutation is N-terminal or C-terminal to all detected HSPs. Class D: No Arabidopsis BLAST hit. Blosum scoring reflects the likely severity of effect of substitutions to disrupt protein function. We used the Blosum62 score (Henikoff and Henikoff, 1993). Segregating the BLAST classes A to D according to the severity score (mutation has Blosum <−1 or ≥−1) yielded eight informative classes (Cross et al., 2017).
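The eight classes arise from crossing the four BLAST classes with the two Blosum62 severity bins. A minimal sketch of that bookkeeping follows; the function name and the "severe"/"mild" labels are hypothetical, and the Blosum62 score is taken as an input rather than looked up.

```python
BLAST_CLASSES = ("A", "B", "C", "D")


def blast_blosum_class(blast_class, blosum62_score):
    """Combine a BLAST class (A to D) with the Blosum62 severity bin.

    Scores below -1 are binned as 'severe', scores of -1 or above as 'mild'
    (label names are illustrative only).
    """
    if blast_class not in BLAST_CLASSES:
        raise ValueError(f"unknown BLAST class: {blast_class!r}")
    severity = "severe" if blosum62_score < -1 else "mild"
    return f"{blast_class}-{severity}"


if __name__ == "__main__":
    print(blast_blosum_class("A", -3))
    print(blast_blosum_class("D", 1))
```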
Time-Lapse Microscopy Screens
Freshly growing arrayed mutants organized in 100-block were plated on low-nitrogen TAP plates in the light at 21°C to allow synchronization. Following 2 d of growth, colonies were replicated on TAP plates and light microscopy images were taken at 0, 10, 20, and 48 h at 33°C.
The “fragility” category is based on sensitivity of arrested cells on an agar pad to manipulation with a dissecting needle. Wild-type and most mutant cells could be moved without problems. Fragile mutant cells promptly exploded when touched with a needle.
Synchronization
Cells were plated on 0.1× nitrogen TAP plates in the light at 21°C. After 2 d, the cells had completed divisions and arrested as small newborns. Cells were then plated on TAP plates at 33°C with light. After 14 h, wild-type cells were in the middle of the division phase (mix of cells undergoing S phase and mitosis, and newborn cells).
Flow Cytometry
Samples were fixed in 3:1 ethanol:acetic acid (15 min), washed in PBS, incubated in PBS + RNase A (100 μg/mL) at 37°C for 2 h, and resuspended in PBS. DNA was stained using PBS + 500 nM SYTOX Green (Thermo Fisher Scientific). Samples were read (20,000 cells each) in the BD Accuri C6 platform using a 90% attenuator to avoid saturation in the FL1 channel (excitation, 488 nm; emission, 530 ± 15 nm). Data were processed and analyzed using a downloadable MATLAB program (Nitai Steinberg; https://www.mathworks.com/matlabcentral/fileexchange/38080-flow-cytometry-gui-for-matlab). Gates for each DNA peak were set according to the wild-type sample and were applied accordingly for each sample. Gates were slightly modified to account for minor variations in peak positions between samples; these were mainly due to a known effect whereby larger cells give a higher signal with the same amount of nuclear DNA (Tulin and Cross, 2014).
Image Stream Analysis
DNA-stained samples were analyzed in an ImageStream X Mark II Imaging Flow Cytometer (10,000 cells from each). Images were then analyzed using IDEAS6.2. In order to distinguish single dividing cells from aggregates, many samples were examined manually and two readouts were found to be efficient: area of the object, and the longitudinal/horizontal axes ratio. The singlets gate was defined as follows: area < 50 with ratio 0.3 to 1 (newborn cells), or 50 < area < 200 with ratio 0.75 to 1. A few independent samples were then examined with this gate, and the number of aggregates within the singlets gate and the number of single cells excluded were counted. Sample 1: 2/250 cells counted were aggregates (total number within gate: 6054 cells), 43/160 cells counted were single cells (total number outside the gate: 1213 cells). Sample 2: 1/240 cells counted were aggregates (total number within gate: 5973 cells), 47/160 cells counted were single cells (total number outside the gate: 1190 cells). Sample 3: 11/250 cells counted were aggregates (total number within gate: 6136 cells), 34/125 cells counted were single cells (total number outside the gate: 1859 cells). Sample 4: 1/250 cells counted were aggregates (total number within gate: 5310 cells), 51/160 cells counted were single cells (total number outside the gate: 1656 cells). Sample 5: 0/250 cells counted were aggregates (total number within gate: 4464 cells), 84/160 cells counted were single cells (total number outside the gate: 1285 cells). In conclusion, between 0 and 4% aggregates were included in the singlets gate, while only ∼10% of single cells were excluded from final analysis. The singlets gate was applied to all samples tested, and information was extracted and processed using an in-house MATLAB program (available upon request). Moreover, each sample was manually examined and the number of nuclei per dividing cell was counted for 50 to 80 cells from each DNA peak.
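The singlets gate and the validation counts above can be restated as a short calculation. The gate function follows the stated limits literally (an object with area exactly 50 falls in neither branch); the extrapolation from manually inspected cells to whole-sample fractions is one possible reading of how the summary percentages were derived, not a confirmed description of the authors' calculation. Function and variable names are hypothetical.

```python
def in_singlets_gate(area, axis_ratio):
    """Gate from the text: small newborn cells, or larger near-round objects."""
    if area < 50:
        return 0.3 <= axis_ratio <= 1.0
    if 50 < area < 200:
        return 0.75 <= axis_ratio <= 1.0
    return False


# Per sample: (aggregates seen, cells inspected inside gate, total inside gate,
#              single cells seen, cells inspected outside gate, total outside gate)
SAMPLES = [
    (2, 250, 6054, 43, 160, 1213),
    (1, 240, 5973, 47, 160, 1190),
    (11, 250, 6136, 34, 125, 1859),
    (1, 250, 5310, 51, 160, 1656),
    (0, 250, 4464, 84, 160, 1285),
]


def gate_performance(agg, n_in, total_in, singles, n_out, total_out):
    """Return (aggregate fraction inside gate, fraction of single cells excluded),
    ASSUMING the inspected cells are representative of each compartment."""
    aggregate_fraction = agg / n_in
    singles_inside = total_in * (1.0 - aggregate_fraction)
    singles_outside = total_out * singles / n_out
    excluded_fraction = singles_outside / (singles_inside + singles_outside)
    return aggregate_fraction, excluded_fraction


if __name__ == "__main__":
    for i, sample in enumerate(SAMPLES, start=1):
        agg_frac, excl_frac = gate_performance(*sample)
        print(f"Sample {i}: aggregates in gate {agg_frac:.1%}, singles excluded {excl_frac:.1%}")
```

Under this reading, the aggregate fraction inside the gate ranges from 0 to about 4% and the estimated fraction of single cells excluded ranges from roughly 5 to 13% across the five samples, broadly in line with the summary figures quoted in the text.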
Accession Numbers
Accession numbers (Phytozome v.10; http://phytozome.jgi.doe.gov/pz/portal.html) of Chlamydomonas genes discussed in this article are as follows: CYCD2, Cre06.g289750.t1.1; CYCD3, Cre06.g284350.t1.1; CDC20, Cre09.g399911.t1.1; CYCB1, Cre08.g370401.t1.1; CDKA1, Cre10.g465900.t1.2; MED6, Cre11.g467659.t1.1; RECQL4, Cre15.g634701.t1.1; RIR1, Cre12.g492950.t1.2; SENP7, Cre06.g278197.t1.1; BSL1, Cre01.g050850.t1.2; APC1, Cre13.g579100.t1.1; APC2, Cre10.g460532.t1.1; APC4, Cre12.g556650.t1.2; APC6, Cre13.g562950.t1.1; APC8, Cre16.g681578.t1.1; SPC25, Cre16.g658600.t1.1; SMC2, Cre02.g086650.t1.2; SMC4, Cre12.g493400.t1.1; CDKB1, Cre08.g372550.t1.1; CKS1, Cre03.g180350.t1.2; and PIP5K9, Cre09.g400478.t1.1. Accession numbers for remaining genes in which we isolated mutations are available in Supplemental Data Sets 1 to 3. Sequence data (.fastq or .bam files) from this article and in-house software are available upon request.
Supplemental Data
Supplemental Figure 1. A schematic presentation of the mutant hunt and identification of causative mutations.
Supplemental Figure 2. Minority reads are almost exclusively assigned to occupied patterns within the set of libraries.
Supplemental Figure 3. Scrambling strategy confirms correct mutations assignments.
Supplemental Figure 4. Confirmation of causality identification by isolating revertants.
Supplemental Figure 5. High-throughput imaging of DNA stained samples provides an efficient tool to exclude cell aggregates.
Supplemental Figure 6. High reproducibility of DNA profiling assay as demonstrated by pairs of siblings.
Supplemental Figure 7. Ploidy measurements are highly reproducible.
Supplemental Figure 8. Applicability of MAPS-SEQ to diploid meiotic organisms.
Supplemental Table 1. Comparison of MAPS-SEQ to bulked-segregant analysis.
Supplemental Data Set 1. Candidate null allele mutations identified within the set of passenger mutations.
Supplemental Data Set 2. All mutations in essential genes identified in this work.
Supplemental Data Set 3. Raw data of DNA content profiles of all mutants tested by flow cytometry.
Supplemental Data Set 4. Newly identified genes with potential involvement in cell division.
Acknowledgments
We thank the Cross lab members for advice and useful discussion. We thank Kresti Pecani for development and application of PCR-free methods for DNA sample preparation. This work was supported by National Institutes of Health Grant GM07853 to F.R.C., by a Junior Fellow award from the Simons Foundation to M.B., and by The Rockefeller University.
AUTHOR CONTRIBUTIONS
M.B., K.L., and F.R.C. designed and performed research. M.B. and F.R.C. contributed new computational tools and wrote the article.
References
- Atkins K.C. (2018). Cell Cycle Control by Cyclin-CDKs in Chlamydomonas reinhardtii. PhD dissertation (New York: The Rockefeller University).
- Atkins K.C., Cross F. (2018). Interregulation of CDKA/CDK1 and the plant-specific cyclin-dependent kinase CDKB in control of the Chlamydomonas cell cycle. Plant Cell 30: 429–446.
- Baek K., Kim D.H., Jeong J., Sim S.J., Melis A., Kim J.S., Jin E., Bae S. (2016). DNA-free two-gene knockout in Chlamydomonas reinhardtii via CRISPR-Cas9 ribonucleoproteins. Sci. Rep. 6: 30620.
- Bell S.P., Dutta A. (2002). DNA replication in eukaryotic cells. Annu. Rev. Biochem. 71: 333–374.
- Bisova K., Krylov D.M., Umen J.G. (2005). Genome-wide annotation and expression profiling of cell cycle regulatory genes in Chlamydomonas reinhardtii. Plant Physiol. 137: 475–491.
- Bloom J., Cross F.R. (2007). Multiple levels of cyclin specificity in cell-cycle control. Nat. Rev. Mol. Cell Biol. 8: 149–160.
- Breker M., Lieberman K., Tulin F., Cross F.R. (2016). High-throughput robotically assisted isolation of temperature-sensitive lethal mutants in Chlamydomonas reinhardtii. J. Vis. Exp. 118: 54831.
- Cannarozzi G., Schraudolph N.N., Faty M., von Rohr P., Friberg M.T., Roth A.C., Gonnet P., Gonnet G., Barral Y. (2010). A role for codon order in translation dynamics. Cell 141: 355–367.
- Coleman A.W., Maguire M.J. (1982). A microspectrofluorometric analysis of nuclear and chloroplast DNA in Volvox. Dev. Biol. 94: 441–450.
- Cools T., Iantcheva A., Weimer A.K., Boens S., Takahashi N., Maes S., Van den Daele H., Van Isterdael G., Schnittger A., De Veylder L. (2011). The Arabidopsis thaliana checkpoint kinase WEE1 protects against premature vascular differentiation during replication stress. Plant Cell 23: 1435–1448.
- Cross F.R. (2016). Tying down loose ends in the Chlamydomonas genome: functional significance of abundant upstream open reading frames. G3 (Bethesda) 6: 435–446.
- Cross F.R., Buchler N.E., Skotheim J.M. (2011). Evolution of networks and sequences in eukaryotic cell cycle control. Philos. Trans. R. Soc. Lond. B Biol. Sci. 366: 3532–3544.
- Cross F.R., Breker M., Lieberman K. (2017). Validated Bayesian differentiation of causative and passenger mutations. G3 (Bethesda) 7: 2081–2094.
- Dutcher S.K. (1995). Mating and tetrad analysis in Chlamydomonas reinhardtii. Methods Cell Biol. 47: 531–540.
- Erlich Y., Chang K., Gordon A., Ronen R., Navon O., Rooks M., Hannon G.J. (2009). DNA Sudoku: harnessing high-throughput sequencing for multiplexed specimen analysis. Genome Res. 19: 1243–1253.
- Francis D. (2007). The plant cell cycle: 15 years on. New Phytol. 174: 261–278.
- Fujimitsu K., Grimaldi M., Yamano H. (2016). Cyclin-dependent kinase 1-dependent activation of APC/C ubiquitin ligase. Science 352: 1121–1124.
- Gaudet M., Fara A.G., Beritognolo I., Sabatti M. (2009). Allele-specific PCR in SNP genotyping. Methods Mol. Biol. 578: 415–424.
- Gingold H., Pilpel Y. (2011). Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7: 481.
- Greiner A., Kelterborn S., Evers H., Kreimer G., Sizova I., Hegemann P. (2017). Targeting of photoreceptor genes in Chlamydomonas reinhardtii via zinc-finger nucleases and CRISPR/Cas9. Plant Cell 29: 2498–2518.
- Harris E. (2008). The Chlamydomonas Sourcebook: Introduction into Chlamydomonas and Its Laboratory Use. (San Diego, CA: Elsevier Academic Press).
- Henikoff S., Henikoff J.G. (1993). Performance evaluation of amino acid substitution matrices. Proteins 17: 49–61.
- Henikoff J.G., Greene E.A., Taylor N., Henikoff S., Pietrokovski S. (2002). Using the blocks database to recognize functional domains. Curr. Protoc. Bioinformatics 2: 2.2.
- Inzé D., De Veylder L. (2006). Cell cycle regulation in plant development. Annu. Rev. Genet. 40: 77–105.
- Kumar N., et al. (2015). Functional conservation in the SIAMESE-RELATED family of cyclin-dependent kinase inhibitors in land plants. Plant Cell 27: 3065–3080.
- Langmead B., Trapnell C., Pop M., Salzberg S.L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10: R25.
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R.; 1000 Genome Project Data Processing Subgroup (2009). The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079.
- Li X., Zhang R., Patena W., Gang S.S., Blum S.R., Ivanova N., Yue R., Robertson J.M., Lefebvre P.A., Fitz-Gibbon S.T., Grossman A.R., Jonikas M.C. (2016). An indexed, mapped mutant library enables reverse genetics studies of biological processes in Chlamydomonas reinhardtii. Plant Cell 28: 367–387.
- McGarry T.J., Kirschner M.W. (1998). Geminin, an inhibitor of DNA replication, is degraded during mitosis. Cell 93: 1043–1053.
- Merchant S.S., et al. (2007). The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318: 245–250.
- Naya H., Romero H., Carels N., Zavala A., Musto H. (2001). Translational selection shapes codon usage in the GC-rich genome of Chlamydomonas reinhardtii. FEBS Lett. 501: 127–130.
- Nowack M.K., Harashima H., Dissmeyer N., Zhao X., Bouyer D., Weimer A.K., De Winter F., Yang F., Schnittger A. (2012). Genetic framework of cyclin-dependent kinase function in Arabidopsis. Dev. Cell 22: 1030–1040.
- Onishi M., Pringle J.R., Cross F.R. (2016). Evidence that an unconventional actin can provide essential F-actin function and that a surveillance system monitors F-actin integrity in Chlamydomonas. Genetics 202: 977–996.
- Qiao R., et al. (2016). Mechanism of APC/CCDC20 activation by mitotic phosphorylation. Proc. Natl. Acad. Sci. USA 113: E2570–E2578.
- Remm M., Storm C.E.V., Sonnhammer E.L.L. (2001). Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol. 314: 1041–1052.
- Rogozin I.B., Basu M.K., Csürös M., Koonin E.V. (2009). Analysis of rare genomic changes does not support the unikont-bikont phylogeny and suggests cyanobacterial symbiosis as the point of primary radiation of eukaryotes. Genome Biol. Evol. 1: 99–113.
- Salinas T., Duby F., Larosa F., Coosemans M., Bonnefoy N., Motte P., Maréchal-Drouard L., Remacle C. (2012). Co-evolution of mitochondrial tRNA import and codon usage determines translational efficiency in the green alga Chlamydomonas. PLoS Genet. 8: e1002946.
- Sørensen M.A., Kurland C.G., Pedersen S. (1989). Codon usage determines translation rate in Escherichia coli. J. Mol. Biol. 207: 365–377.
- Tulin F., Cross F.R. (2014). A microbial avenue to cell cycle control in the plant superkingdom. Plant Cell 26: 4019–4038.
- Tulin F., Cross F.R. (2016). Patching holes in the Chlamydomonas genome. G3 (Bethesda) 6: 1899–1910.
- Wignall S.M., Deehan R., Maresca T.J., Heald R. (2003). The condensin complex is required for proper spindle assembly and chromosome segregation in Xenopus egg extracts. J. Cell Biol. 161: 1041–1051.
- Zhang R., Patena W., Armbruster U., Gang S.S., Blum S.R., Jonikas M.C. (2014). High-throughput genotyping of green algal mutants reveals random distribution of mutagenic insertion sites and endonucleolytic cleavage of transforming DNA. Plant Cell 26: 1398–1409.
- Zhang S., Chang L., Alfieri C., Zhang Z., Yang J., Maslen S., Skehel M., Barford D. (2016). Molecular mechanism of APC/C activation by mitotic phosphorylation. Nature 533: 260–264.
- Zones J.M., Blaby I.K., Merchant S.S., Umen J.G. (2015). High-resolution profiling of a synchronized diurnal transcriptome from Chlamydomonas reinhardtii reveals continuous cell and metabolic differentiation. Plant Cell 27: 2743–2769.