eLife. 2019 Sep 9;8:e48119. doi: 10.7554/eLife.48119

Evolution of Yin and Yang isoforms of a chromatin remodeling subunit precedes the creation of two genes

Wen Xu 1, Lijiang Long 1,2, Yuehui Zhao 1, Lewis Stevens 3, Irene Felipe 4, Javier Munoz 5, Ronald E Ellis 6, Patrick T McGrath 1,7,8
Editors: Erich M Schwarz9, Detlef Weigel10
PMCID: PMC6752949  PMID: 31498079

Abstract

Genes can encode multiple isoforms, broadening their functions and providing a molecular substrate to evolve phenotypic diversity. Evolution of isoform function is a potential route to adapt to new environments. Here we show that de novo, beneficial alleles in the nurf-1 gene became fixed in two laboratory lineages of C. elegans after isolation from the wild in 1951, before methods of cryopreservation were developed. nurf-1 encodes an ortholog of BPTF, a large (>300 kD) multidomain subunit of the NURF chromatin remodeling complex. Using CRISPR-Cas9 genome editing and transgenic rescue, we demonstrate that in C. elegans, nurf-1 has split into two, largely non-overlapping isoforms (NURF-1.D and NURF-1.B, which we call Yin and Yang, respectively) that share only two of 26 exons. Both isoforms are essential for normal gametogenesis but have opposite effects on male/female gamete differentiation. Reproduction in hermaphrodites, which involves production of both sperm and oocytes, requires a balance of these opposing Yin and Yang isoforms. Transgenic rescue and genetic position of the fixed mutations suggest that different isoforms are modified in each laboratory strain. In a related clade of Caenorhabditis nematodes, the shared exons have duplicated, resulting in the split of the Yin and Yang isoforms into separate genes, each containing approximately 200 amino acids of duplicated sequence that has undergone accelerated protein evolution following the duplication. Associated with this duplication event is the loss of two additional nurf-1 transcripts, including the long-form transcript and a newly identified, highly expressed transcript encoded by the duplicated exons. We propose these lost transcripts are non-functional side products necessary to transcribe the Yin and Yang transcripts in the same cells. Our work demonstrates how gene sharing, through the production of multiple isoforms, can precede the creation of new, independent genes.

Research organism: C. elegans

Introduction

There is general interest in understanding how animals adapt to new environments. What are the alleles that matter to positive selection and what sort of genes do they target? Since methods were developed to map and identify the genes harboring causative genetic variation, researchers have often isolated changes in the same gene in different populations or species (Wood et al., 2005; Martin and Orgogozo, 2013). Besides targeting specific genes, evolution can target classes of genes that share molecular features such as biochemical (e.g. chemoreceptor genes; Bachmanov and Beauchamp, 2007; Keller et al., 2007; Wisotsky et al., 2011; Lunde et al., 2012; McRae et al., 2012; McBride et al., 2014; Greene et al., 2016a; Greene et al., 2016b) or developmental function (e.g. master regulators of cell fate; Sucena et al., 2003; Colosimo et al., 2005; Chan et al., 2010; Yang et al., 2018). One molecular feature predicted to be important for evolution is the ability of genes to produce multiple protein isoforms. A single protein-coding gene can produce multiple isoforms using alternative transcription initiation and termination sites combined with alternative splicing between exons (Pan et al., 2008; Pal et al., 2011). Isoform-specific evolution is found throughout vertebrates, including recent evolution of transcript expression in primates (Barbosa-Morais et al., 2012; Merkin et al., 2012; Shabalina et al., 2014; Zhang et al., 2017). Whether the increase in transcriptomic diversity is important for adaptive evolution remains an important question, and only a few examples have shown how isoform evolution could be involved in phenotypic diversity (Mallarino et al., 2017).

The ability of a gene to produce multiple protein isoforms might also play a role in the genesis of new genes. Over long evolutionary timescales, gene duplication and diversification can create paralogous genes with different functions (Ohno, 1970; Innan and Kondrashov, 2010). One central mystery in this process is the order of these two events: do mutations that duplicate genes occur first, or does functional diversification precede the duplication event? One mechanism by which the latter route can occur is gene sharing, the ability of a gene to create multiple protein products (or a single protein product) that have two or more distinct functions (Hughes, 1994). If each isoform acts in different tissues or plays distinct roles in biological processes, subsequent duplication mutations can result in the separation of these isoforms into two distinct genes.

To understand the genetic basis of adaptive evolution in an animal model, we use the small nematode Caenorhabditis elegans. Besides its genetic tractability, use of this organism allows the analysis of evolution at different timescales. For example, experimental evolution can be used to study evolutionary processes in controlled environments on the order of 10–1000 generations (Gray and Cutter, 2014; Teotónio et al., 2017; Penley et al., 2018; Chelo et al., 2019; Saxena et al., 2019; Wernick et al., 2019). For longer timescales, a growing number of isolated and sequenced Caenorhabditis species can be used to study genetic differences responsible for species-level differences (Ting et al., 2018; Yin et al., 2018; Bi et al., 2019; Stevens et al., 2019).

For understanding short-term adaptation, we study two laboratory strains of C. elegans, called N2 and LSJ2, which descended from a single hermaphrodite isolated in 1951 (Figure 1A). These two lineages split from genetically identical populations between 1957 and 1958 and evolved in two very different laboratory environments – N2 grew on agar plates seeded with E. coli bacteria and LSJ2 in liquid cultures containing liver and soy peptone extracts (McGrath et al., 2009; McGrath et al., 2011; Sterken et al., 2015). By the time permanent means of cryopreservation were developed, approximately 300–2000 generations had passed, and ~300 new mutations arose and fixed in one of the two lineages (McGrath et al., 2011). Despite their genetic similarity, substantial divergence has occurred between these strains in terms of phenotype and fitness, including a large number of developmental, behavioral, and reproductive traits. Use of these strains allows us to identify causal genetic variants responsible for phenotypic and fitness changes. To date, five de novo, causal genetic variants have been identified in either the N2 or LSJ2 lineage (de Bono and Bargmann, 1998; McGrath et al., 2009; Persson et al., 2009; McGrath et al., 2011; Duveau and Félix, 2012; Large et al., 2016; Large et al., 2017; Zhao et al., 2018).

Figure 1. An N2-derived genetic variant in the intron of nurf-1 increases fitness in laboratory conditions.

(A) History of two laboratory-adapted C. elegans strains, N2 and LSJ2, which descend from the same individual hermaphrodite isolated in 1951. The N2 and LSJ2 lineages split sometime around 1958. N2 grew on agar plates with E. coli OP50 as a food source for around 11 years until they were cryopreserved. LSJ2 animals were cultured in liquid axenic media containing sheep liver extract and soy extract peptone as a food source for about 51 years until they were cryopreserved. 302 genetic variations were fixed between these two strains, including two that fall in the nurf-1 gene – WBVar00601361 and WBVar00601565. (B) Genetic location of two nurf-1 variations. WBVar00601361 (in red box) is an N2-derived intron single nucleotide substitution T/A (N2/ancestral) in the 2nd intron of nurf-1. WBVar00601565 is an LSJ2-derived 60 bp deletion in the 3’ end of nurf-1 that removes the last 18 amino acids and part of the 3’-UTR. (C) Comparison of NURF-1 orthologs from Drosophila and humans showing position of protein domains and conserved regions as determined by Blastp and Clustal Omega. (D) Boxplot of pairwise evolutionary fitness differences between the indicated strains measured by directly competing the indicated strains against each other for five generations. PTM288 and PTM229 were created from the N2* and N2 strains, respectively, by engineering a DNA barcode in the dpy-10 gene. PTM417 is the same genotype as ARL(del_LSJ2>N2), with the exception of a background mutation in the spe-9 gene that occurred during the construction of the ARL(del_LSJ2>N2) strain (for details see Methods). This mutation was crossed out of the PTM417 strain and used as a barcode for the digital PCR reaction. The genotype of each nurf-1 allele (shown in B) is indicated by color. The NIL strain also contains LSJ2 alleles of additional linked mutations, which is indicated by the blue horizontal line. (E) Total brood size of the N2 and ARL(intron,LSJ2>N2) strains. (F) Number of differentially expressed genes between synchronized N2 and ARL(intron,LSJ2>N2) animals harvested 52 hr (L4 stage - when spermatogenesis is active) or 60 hr (young adults - when oogenesis is active) after hatching. For all figures, each dot represents an independent replicate, the box indicates the interquartile values of all data, and the line indicates the median of all data. Positive values indicate strain one is more fit than strain two. Negative values indicate strain two is more fit than strain one. For all figures, n.s. indicates p>0.05, one star indicates significant difference at p<0.05 level, two stars indicate significant difference at p<0.01 level, and three stars indicate significant difference at p<0.001 level.

Figure 1—source data 1. Source data for Figure 1.
DOI: 10.7554/eLife.48119.006

Figure 1—figure supplement 1. Egg-laying rate of four strains.

Egg-laying rate was calculated at the indicated time points. Six L4 animals were picked onto each assay plate (time = 0) for the indicated times. The total number of eggs was counted for each plate and used to calculate the average number of eggs laid per hour for each animal. Each individual trial is shown as a dim line. The mean for each strain is shown as the bold, colored lines. The effect of the intron SNV on this trait was subtle, especially when compared to the difference in reproductive rate caused by the nurf-1 60 bp deletion (ARL(del,LSJ2>N2*)). We speculate this is explained by epistasis between the two nurf-1 mutations, with the LSJ2 combination of both alleles having a non-linear effect on reproductive rate. We are unable to test this hypothesis as we did not construct a double ARL strain containing LSJ2 alleles of both of the nurf-1 mutations due to the difficulty in creating the intron edit (see Methods). Alternatively, additional mutations in NIL(nurf-1,LSJ2>N2*) could contribute to reproductive rate.
Figure 1—figure supplement 1—source data 1. Source data for Figure 1—figure supplement 1.
DOI: 10.7554/eLife.48119.004
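
The egg-laying rate described in the legend above is a per-plate average. A minimal sketch of that arithmetic follows, assuming each plate is summarized by an egg count, the number of animals on the plate, and the length of the laying window in hours. The function name and the example numbers are hypothetical and are not taken from the source data.

```python
def eggs_per_animal_per_hour(total_eggs, n_animals, hours):
    """Average number of eggs laid per hour per animal on one assay plate."""
    if n_animals <= 0 or hours <= 0:
        raise ValueError("n_animals and hours must be positive")
    return total_eggs / (n_animals * hours)


# Hypothetical plate: six animals, a 2 hr laying window, 54 eggs counted.
rate = eggs_per_animal_per_hour(total_eggs=54, n_animals=6, hours=2)
print(rate)  # 4.5
```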

Figure 1—figure supplement 2. Transcriptional analysis of N2 and ARL(intron,LSJ2>N2) at 52 and 60 hr.

(A) Multi-dimensional scaling analysis of transcriptional responses indicates that the N2 and ARL(intron,LSJ2>N2) strains show different expression patterns at 52 hr but not 60 hr. (B) Analysis of genes upregulated (top) or downregulated (bottom) at all four strain/conditions (left) and also by tissue-specific expression.

One of these mutations is an LSJ2-derived, 60 bp deletion at the 3’ end of the nurf-1 gene that reduces growth rate, slows reproductive output, and prevents development into the dauer diapause state in response to ascaroside pheromones (Figure 1B) (Large et al., 2016). This genetic variant is beneficial in the LSJ2 liquid cultures in which it arose and fixed, but places animals at a disadvantage in the agar plate environments in which N2 evolved, an example of gene-environment interaction (Large et al., 2016). We proposed that nurf-1 is a regulator of life-history tradeoffs. Life-history tradeoffs represent competing biological traits requiring large energetic investments, such as the tradeoff between energy required for reproduction versus the energy required for individual survival. The difference in fitness of this allele in the two laboratory environments is potentially determined by how the life-history tradeoffs map into reproductive success.

Studies of nurf-1 and its orthologs provide fundamental support for its role as a life-history regulator. nurf-1 encodes an ortholog of mammalian BPTF, a subunit of the NURF chromatin remodeling complex (Barak et al., 2003) (Figure 1C). BPTF encodes a large protein containing a number of domains that facilitate recruitment of NURF to specific regions of the genome for chromatin remodeling (Alkhatib and Landry, 2011), including domains that interact with sequence-specific transcription factors and three PHDs and a bromodomain that facilitate interactions with modified nucleosomes (Li et al., 2006; Wysocka et al., 2006; Kwon et al., 2009; Ruthenburg et al., 2011). Through its DDT domain (Fyodorov and Kadonaga, 2002), BPTF cooperates with ISWI to slide nucleosomes along DNA, changing access of promoter regions to transcription factors that drive gene transcription. In mammals, BPTF regulates cellular differentiation and homeostasis of specific cell-types and tissues, including the distal visceral endoderm (Landry et al., 2008), ectoplacental cone (Goller et al., 2008), hematopoietic stem/progenitor cells (Xu et al., 2018), mammary stem cells (Frey et al., 2017), T-cells (Wu et al., 2016), and melanocytes (Koludrovic et al., 2015). In Drosophila, the ortholog to BPTF, NURF301, regulates the heat shock response, pupation, spermatogenesis, and innate immunity (Badenhorst et al., 2002; Badenhorst et al., 2005; Kwon et al., 2008; Kwon et al., 2009). Many of these traits can be viewed as life-history tradeoffs, e.g. large energetic investments in individual survival through the development of the immune system vs. energetic transfers to offspring in the placenta or mammary glands. The evolution of BPTF/NURF-1 function might also be relevant in human disease. Genetic alterations in BPTF have been reported in tumors, including gene amplification and point mutations (Buganim et al., 2010; Balbás-Martínez et al., 2013). In addition, BPTF has been shown to be required for the transcriptional activity of c-MYC, a major human oncogene (Richart et al., 2016).

In this paper, we continue our studies of the evolution of the N2/LSJ2 laboratory strains. We demonstrate that an independent, beneficial mutation in the nurf-1 gene was fixed in the N2 lineage, suggesting that nurf-1 is a preferred genetic target for laboratory adaptation. To understand why nurf-1 might be targeted, we explored its in vivo role in C. elegans development by taking advantage of CRISPR-Cas9 to test causal relationships that inform laboratory evolution and fitness effects. Our work suggests that the large, full-length isoform of nurf-1, primarily studied in mammals, is dispensable for development. Instead, two largely non-overlapping isoforms are both essential for reproduction, having opposing effects on cellular differentiation of gametes into sperm or oocytes. Our results suggest that the ability of nurf-1 to regulate life-history tradeoffs is the result of exquisite regulation of NURF function through the balance of two competing isoforms, reminiscent of the principle of Yin and Yang. Finally, we demonstrate that these two isoforms have split into separate genes in a clade of related nematodes, potentially resolving transcriptional and functional conflict between the Yin and Yang isoforms. Our work demonstrates how evolution of isoforms can precede the origin of a new gene, supporting a role for gene sharing in the origin of functionally novel proteins.

An N2-derived variant in the second intron of nurf-1 increases fitness and brood size in laboratory conditions

We previously mapped differences in a number of traits (including reproductive rate, fecundity, toxin and anthelmintic sensitivity, and laboratory fitness) between N2 and LSJ2 to a QTL centered over nurf-1, which contains a derived mutation in both the N2 and LSJ2 lineages (Figure 1A and B) (Large et al., 2016; Large et al., 2017; Zhao et al., 2019). The LSJ2 allele of nurf-1 contains a 60 bp deletion in the 3’ end of the coding region of the gene, overlapping the stop codon and probably resulting in the translation of parts of the 3’ UTR. The N2 allele of nurf-1 contains an SNV that converts an A to a T in a homopolymer run of Ts in the 2nd intron (Figure 1B). Using CRISPR-Cas9-based genome editing, we previously demonstrated that the LSJ2-derived deletion accounted for a large portion of the trait variance in reproductive rate explained by the QTL. However, it did not explain the entire effect of this locus (Large et al., 2016). We decided to test whether this additional genetic variant or variants affected fitness of the animals in laboratory conditions using a previously described pairwise competition assay (Zhao et al., 2018). To do so, we took advantage of three strains we had previously created: CX12311, a near isogenic line with ancestral (non-N2) alleles of npr-1 and glb-5 crossed into an otherwise N2 genetic background, which we have used to eliminate the fitness and phenotypic effects of the derived (N2) alleles of npr-1 and glb-5 (Zhao et al., 2018) (Figure 1D - referred to as N2*); NIL(nurf-1,LSJ2>N2*), a near isogenic line containing LSJ2 alleles of both nurf-1 mutations backcrossed into an N2* background (Large et al., 2016); and ARL(del,LSJ2>N2*), an allelic replacement line containing the LSJ2-derived 60 bp deletion edited into the N2* strain using CRISPR-Cas9. Phenotypic differences between the NIL(nurf-1,LSJ2>N2*) and ARL(del,LSJ2>N2*) strains are therefore caused by the N2-derived intron SNV in nurf-1, or by one of the additional seven linked LSJ2-N2 genetic variants near nurf-1.

We measured the relative fitness of the N2*, NIL(nurf-1,LSJ2>N2*), and ARL(del,LSJ2>N2*) strains against PTM288, a version of N2* that also contains a silent mutation in the dpy-10 gene (Figure 1D). The dpy-10 silent mutation provides a common genetic variant that can be used to quantify the relative proportion of each strain on a plate using digital droplet PCR. Both the NIL(nurf-1,LSJ2>N2*) and ARL(del,LSJ2>N2*) strains showed dramatically reduced fitness compared to PTM288, consistent with our previous report showing that the 60 bp deletion is deleterious on agar plates (Large et al., 2016). However, the NIL(nurf-1,LSJ2>N2*) strain was quantitatively and significantly less fit than the ARL(del,LSJ2>N2*) strain, suggesting that additional genetic variant(s) in the NIL(nurf-1,LSJ2>N2*) strain further reduced its fitness. To confirm this result, we also directly competed the NIL(nurf-1,LSJ2>N2*) and ARL(del,LSJ2>N2*) strains against each other, using a nearly neutral background mutation in spe-9(kah132) to distinguish the two strains (Figure 1D).
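
To illustrate how strain proportions from a competition assay can be turned into a relative fitness value, the sketch below uses a generic discrete-generation selection model in which the odds of strain one versus strain two change by a constant factor w each generation. This model, the function name, and the example proportions are assumptions made for illustration. The fitness calculation actually used in this study is defined in its Methods and may differ.

```python
import math


def relative_fitness(p_start, p_end, generations):
    """Per-generation relative fitness w of strain one versus strain two.

    Assumes the odds p / (1 - p) of strain one are multiplied by a
    constant factor w every generation (an illustrative model only).
    """
    for p in (p_start, p_end):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    log_odds_start = math.log(p_start / (1.0 - p_start))
    log_odds_end = math.log(p_end / (1.0 - p_end))
    return math.exp((log_odds_end - log_odds_start) / generations)


# Hypothetical example: strain one rises from 50% to 80% of the
# population over two generations, so its odds go from 1 to 4 and
# w is approximately 2.
w = relative_fitness(p_start=0.5, p_end=0.8, generations=2)
print(round(w, 6))
```

Under this model, log(w) is positive when strain one is more fit than strain two and negative when strain two is more fit.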

To determine if the N2-derived intron SNV in nurf-1 (Figure 1B) was responsible for the fitness gains (as opposed to one of the seven linked LSJ2/N2 variants), we used CRISPR-Cas9 to directly edit the LSJ2 allele of the intron SNV into the standard N2 strain to create a strain we will refer to as ARL(intron,LSJ2>N2). We measured the relative fitness of the ARL(intron,LSJ2>N2) and N2 strains against PTM229 (a strain which again contains a dpy-10 silent mutation). The ARL(intron,LSJ2>N2) strain was significantly less fit than the N2 strain at a level similar to the difference between the NIL(nurf-1,LSJ2>N2*) and ARL(del,LSJ2>N2*) strains (Figure 1D). These results indicate that beneficial alleles of nurf-1 arose in both laboratory lineages - the 60 bp deletion makes LSJ2 animals more fit in liquid, axenic media (Large et al., 2016), and the intron SNV makes N2 animals more fit on agar plates seeded with bacteria.

In C. elegans, brood size of hermaphrodites is negatively correlated with the timing of initial egg-laying. It has been suggested that this life-history tradeoff has been optimized in N2 animals (Hodgkin and Barnes, 1991; Cutter, 2004). After sexual maturation, gonads in the hermaphroditic sex initially undergo spermatogenesis before transitioning to oogenesis; a concomitant lengthening of spermatogenesis time increases the total brood size of hermaphrodites but also delays when reproduction can start. When we compared the total fecundity of the N2 and ARL(intron,LSJ2>N2) strains, we found a significant difference, with the ARL(intron,LSJ2>N2) strain producing ~30 fewer offspring than N2 (Figure 1E). This could indicate that spermatogenesis occurs for a longer period of time in N2 animals (to produce more sperm), or, alternatively, that sperm are less likely to fertilize an egg in the ARL(intron,LSJ2>N2) strain. The reproductive rate of the N2 and ARL(intron,LSJ2>N2) strains was largely unchanged throughout their reproductive lifespan (Figure 1—figure supplement 1).

RNA-seq analysis identified transcriptional differences caused by the intron SNV during spermatogenesis, supporting our hypothesis that sperm development is affected by this SNV. We collected RNA from synchronized N2 and ARL(intron,LSJ2>N2) hermaphrodites at two timepoints, 52 and 60 hr after hatching, which occur during spermatogenesis (52 hr) or oogenesis (60 hr). Interestingly, a large number of genes are differentially expressed between the two strains but only during the 52 hr timepoint (3384 genes vs. 25 genes) (Figure 1F, Figure 1—figure supplement 2A, and Supplementary file 1). Inspection of these differentially expressed genes in a single-cell RNAseq dataset (Cao et al., 2017) demonstrated that although a portion of these 3384 genes are expressed in the germline, these genes are also expressed in additional tissues (Figure 1—figure supplement 2B). Gene ontology analysis suggests that cuticle development and innate immune responses are regulated by nurf-1 (Supplementary file 2), consistent with the role of its orthologs in regulating immunity and melanocyte proliferation in Drosophila and humans (Kwon et al., 2008; Landry et al., 2011; Koludrovic et al., 2015; Wu et al., 2016). The restriction of most of these transcriptional changes to a specific timepoint (i.e. 52 hr) could reflect a specific role for nurf-1 in regulating genes undergoing short bursts of transcriptional upregulation during this developmental timepoint (e.g. molting), a specific role for nurf-1 in regulating cell number or activity of specific cell-types that are transiently present during this timepoint (e.g. spermatocytes), or some combination of both. These results suggest that the intron SNV regulates a number of developmental processes including spermatogenesis, molting, and innate immunity.

nurf-1 produces multiple transcripts encoding multiple protein isoforms

Our results suggest that selection acted repeatedly on C. elegans nurf-1 during laboratory growth. The molecular nature of NURF-1, an essential subunit of the NURF chromatin remodeling complex, is surprising for a hotspot gene. In general, chromatin remodelers are thought of as ubiquitously expressed regulators with little variation in different cell types, akin to general function RNA polymerase proteins or ribosomes. Why would genetic perturbation of nurf-1 lead to increased fitness? One potential clue is the complexity of the nurf-1 locus. Previous cDNA analysis of nurf-1 identified four unique transcripts encoding four unique isoforms (Andersen et al., 2006), two of which have been shown to affect different phenotypes (summarized in Table 1).

Table 1. Summary of major nurf-1 transcripts identified in C. elegans.

Name | Evidence: transcript (a) | Evidence: protein (b) | Size (aa) | Size (kD) | Conserved (c) | Predicted biological role in C. elegans (d) | Other names
nurf-1.a | N | - | 2197 | 252 | M,D | None | Full-length
nurf-1.b | C,N,I | W | 1621 | 186 | D | Reproduction, vulval development | N-terminal or NURF-1.A
nurf-1.d | C,N,I | W | 816 | 92 | - | Size, dauer, reproduction, axon guidance | C-terminal or NURF-1.C
nurf-1.f | C,N,I | W | 581 | 58 | - | None | NURF-1.E
nurf-1.q | N,I | - | 243 | 36 | - | None | -

a C indicates full-length cDNA have been isolated for this transcript, N indicates evidence from direct sequencing of RNA or cDNA using Oxford Nanopore reads support this transcript, and I indicates evidence from Illumina short read RNA-seq supports this transcript.

b W indicates evidence for the protein isoform was obtained using western blot.

c M or D indicates an analogous isoform is described in mammals (mice or humans) or Drosophila, respectively.

d Predictions from Andersen et al. (2006), Large et al. (2016), or Mariani et al. (2016).

To identify other transcripts produced by nurf-1 and quantify the relative proportions of each that are produced, we analyzed previously published Illumina short-read (Brunquell et al., 2016) (isolated from synchronized L2 larval animals) and Oxford Nanopore long-read RNA sequencing reads (Roach et al., 2019) (isolated from mixed populations) (Figure 2—figure supplements 1 and 2). Our results support many of the conclusions of Andersen et al. (2006) but contain a few surprises. We identified five major transcripts (Figure 2A): three previously isolated (nurf-1.b, nurf-1.d, and nurf-1.f) and two newly identified (nurf-1.a and nurf-1.q) (the mapping of transcript names used in Andersen et al. is listed in Table 1). nurf-1.a encodes a full-length 2197 amino acid isoform analogous to the primary isoform of BPTF in humans and NURF301 in Drosophila (Figure 1C). Despite the expectation that C. elegans would produce a similar protein, the Oxford Nanopore long-read data are the only evidence supporting its existence. The nurf-1.q transcript is predicted to produce a 243 amino acid unstructured protein. With the exception of the full-length nurf-1.a transcript, the overlap of these transcripts is quite minimal, resulting in predicted isoforms with unique protein domains and functions (Figure 2B).

Figure 2. nurf-1 encodes five major transcripts.

(A) Genomic position of the five nurf-1 transcripts supported by Illumina short read and Oxford Nanopore long reads. Each blue box is an exon. Exon number is indicated on the figure. Dark blue exons (10, 16, and 21) are alternatively spliced, resulting in a 6–9 bp difference in length (see Figure 2—figure supplement 1 for details). Genomic locations of the HA and FLAG epitope tag insertion sites are shown in black along with their associated allele names. (B) The predicted protein isoforms produced by each of the five major transcripts, along with the domains each isoform contains. Immunoblots only supported translation of the B, D, and F isoforms (see panel D for details). For reference, the spliced nurf-1.a transcript is also shown. (C) Relative expression levels of each transcript, determined by number of Oxford Nanopore reads from a mixed population (top panel) or analysis of Illumina short reads from L2 staged animals using kallisto (bottom panel). tpm = transcripts per million. (D) Western blots of N2 and PTM420 strains. PTM420 contains the HA and FLAG epitope tags shown in panel A. Anti-HA antibody detected a band matching the expected size of the NURF-1.B isoform (arrow). Anti-FLAG antibody detected bands matching the expected size of the NURF-1.D and NURF-1.F isoforms (arrows).

Figure 2—figure supplement 1. RNA-seq analysis of nurf-1.

Coverage plot of reads from RNA. Note the high expression of the 14th, 15th, and 16th exons, supporting the existence of the nurf-1.q transcript. Reads covering the 23rd exon were also observed, supporting the expression of the nurf-1.f transcript. Zoomed-in view of the 10th, 16th, and 21st exons indicates alternative splicing sites are used at these exons. Clipped reads containing SL1 sequence support transcriptional start sites at the 1st and 14th exon.
Figure 2—figure supplement 1—source data 1. Source data for Figure 2—figure supplement 1.
DOI: 10.7554/eLife.48119.010
Figure 2—figure supplement 2. nurf-1 encodes multiple transcripts.

(A) Subset of nurf-1 transcripts analyzed in this paper. Each blue box is an exon. Exon number is indicated on the figure. Dark blue exons are alternatively spliced, resulting in a 6–9 bp difference in length. (B) Nanopore sequencing reads aligned to nurf-1. Reads were grouped by the nurf-1 transcripts they support. Dark purple marks are mismatches from the reference sequence.
Figure 2—figure supplement 2—source data 1. Source data for Figure 2—figure supplement 2.
DOI: 10.7554/eLife.48119.012
Figure 2—figure supplement 3. Relative expression levels of each nurf-1 transcript, determined by analysis of Illumina short reads using kallisto.

Strains and timepoints (relative to the L1 stage) are indicated on the x-axis. No significant difference in transcript levels was observed between the N2 and ARL(intron, LSJ2>N2) strains.
Figure 2—figure supplement 4. Identification of alternative BPTF species in human cancer cells.

(A) Reactivity of anti-BPTF antibodies with lysates from human cancer cells. Cell line-specific patterns were observed, with bands of the same mobility being detected across lines and detected with independent antibodies. The findings suggest the occurrence of multiple BPTF-related protein species in human cancer cells. A representative western blot is shown. (B) Gel bands selected for BPTF Mass Spectrometry identification according to the localization of the BPTF signal detected by western blotting in MCF-7 cells. (Left, NP-40 lysis buffer, gel bands A1 and A2; Right, Laemmli buffer, B1 and B2). The detection of low molecular weight species upon direct cell lysis in Laemmli buffer strongly supports the notion that the findings do not result from artifactual proteolysis. (C) Sequence coverage of BPTF protein. Peptides identified by LC-MS/MS are highlighted in color. (D) BPTF peptide intensity (arbitrary units) calculated by MaxQuant in the gel bands A1, A2/B1, B2.

We quantified the relative expression of these five transcripts by either counting the number of Nanopore reads that matched the transcript or by using kallisto (Bray et al., 2016) to predict transcript abundance using Illumina short-read sequencing data (Figure 2C). These predictions qualitatively agreed in transcript ranking of expression strength (although quantitative variation in predictions was observed, reflective of the different technologies or developmental stages of the animals). Surprisingly, the newly described nurf-1.q transcript was the most highly expressed, followed by the nurf-1.b transcript, and the nurf-1.a, nurf-1.d, and nurf-1.f transcripts were expressed at similar, lower levels.
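
For readers unfamiliar with the transcripts per million (tpm) unit, the sketch below shows the standard normalization: each transcript's read count is divided by its effective length, and the resulting rates are rescaled to sum to one million. It is an illustration only. kallisto additionally resolves reads that are compatible with several transcripts before this step, which is not modeled here, and the transcript names, counts, and lengths are hypothetical.

```python
def tpm(counts, effective_lengths):
    """Convert per-transcript read counts to transcripts per million."""
    # Length-normalized rate for each transcript (reads per base).
    rates = {name: counts[name] / effective_lengths[name] for name in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("at least one transcript needs a nonzero count")
    # Rescale so that the values sum to one million.
    return {name: rate / total * 1e6 for name, rate in rates.items()}


# Hypothetical example: equal read counts on a short and a long transcript.
# The shorter transcript receives the higher tpm value.
example = tpm(
    counts={"tx_short": 100, "tx_long": 100},
    effective_lengths={"tx_short": 500, "tx_long": 2000},
)
for name, value in example.items():
    print(name, round(value))
```

Because tpm corrects for length, a short transcript with the same number of reads as a long one is reported as more abundant, which matters when comparing transcripts whose lengths differ widely.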

Although each of the five major transcripts is transcribed, this result does not necessarily mean they are translated into stable protein products. To facilitate analysis of NURF-1 proteins, we used CRISPR-Cas9 to fuse two distinct epitope tags (HA and 3xFLAG tag) to the endogenous nurf-1 locus, just prior to the stop codons in the 16th and 28th exon, respectively (Figure 2A). Immunoblot analysis supported the expression of the B, D, and F isoforms, but not the A or Q isoforms (Figure 2D). Although larger proteins, such as the A isoform, can be difficult to transfer during immunoblots, the lack of a band matching the small Q isoform suggests the highly expressed nurf-1.q transcript is not translated into protein or the protein is rapidly degraded.

Based upon these results, we speculated that the intronic SNV, which we have shown regulates total fecundity and fitness in laboratory conditions (Figure 1), could specifically alter the expression level of the nurf-1.b transcript. However, analysis of all five nurf-1 transcript levels, using the previously described RNA-seq data on the N2 and ARL(intron,LSJ2>N2) strains, did not reveal any significant expression differences (Figure 2—figure supplement 3). Potentially, this SNV has cell-type-specific effects in spermatocytes; however, our data, which come from RNA collected from whole animals, do not allow us to test this hypothesis.

We also investigated whether similar isoforms could be expressed in human cells, using western blots on a small panel of human cancer cell lines. Interestingly, besides a band matching the expected size of the canonical full-length isoform, a number of additional bands were observed between 150–250 kD (Figure 2—figure supplement 4A). Using mass-spectrometry, we confirmed the presence of multiple BPTF peptides in the bands detected by western blotting (Figure 2—figure supplement 4), consistent with one or more of these bands representing novel BPTF isoforms. Potentially, these isoforms could play a role in cancer metastasis, although we provide no such evidence here. Despite the presence of these additional bands, the full-length version of BPTF is the most highly expressed isoform (Figure 2—figure supplement 4D), consistent with its importance in mammalian species.

The B and D isoforms are both essential for reproduction and the F isoform modifies the heat shock response

Genetic analysis of nurf-1 has primarily relied on two deletion alleles, n4293 and n4295 (Figure 3A) (Andersen et al., 2006). The n4293 allele deletes the first exon and predicted transcriptional start site of the nurf-1.a and nurf-1.b transcripts. The n4295 allele deletes three exons of the nurf-1.a, nurf-1.d, and nurf-1.f transcripts that encode a C-terminal PHD domain (Figure 3—figure supplement 1) necessary for human BPTF function. Comparison of the phenotypes of the n4293 and n4295 homozygotes led to the model that the B isoform is essential for reproduction and that the A, D, and/or F isoforms have subtle effects on growth rate and reproductive rate (Table 1).

Figure 3. An additional isoform besides NURF-1.B is necessary for reproduction in C. elegans.

(A) Genomic positions of nurf-1 classical deletion alleles and nine engineered stop codons created using CRISPR/Cas9-based gene editing. Each allele is color-coded by the reproductive ability of homozygous strains. Green is statistically indistinguishable from wild-type, yellow indicates slightly reduced brood size and change in reproductive rate, red indicates substantially reduced brood size in the first generation and eventual sterility after multiple generations of homozygosity, and dark red indicates sterility in the first generation of homozygosity. (B) Fecundity of indicated strains (shown in x-axis of panel). (C) Normalized body area of the indicated strains. Normalized body area was calculated by thresholding video recordings of each strain to segment individual animals, which were registered throughout each frame of the video. Each dot represents the average area of a single worm, normalized to the N2 data. For red or dark red strains (panel A), measurements were taken on animals homozygous for a single generation. (D) Predicted amino acid change of engineered stop codons and classical alleles on the NURF-1 isoforms. The kah11 mutation only affects the F isoform.

Figure 3—source data 1. Source data for Figure 3.
DOI: 10.7554/eLife.48119.019


Figure 3—figure supplement 1. Histone recognition domains in NURF-1.D are not essential for its activity.


(A) Genomic position of CRISPR/Cas9 edits to modify conserved amino acids required for recognition of modified histone tails. (B) Partial protein alignment of the two C-terminal PHD domains and the bromodomain showing the conserved tryptophan (W) and asparagine (N) residues. In other species, the W to E change impairs the PHD domain’s ability to bind H3K4me3 and the N to A change impairs the bromodomain’s ability to bind H4K16Ac. (C) Normalized body size of indicated strains. n4295 is a deletion allele predicted to delete both PHD domains and create a frame shift in the bromodomain. The 2PHDs strain contains both the 1,926 WtoE and 1,986 WtoE edits. The three domains strain contains all three edits shown in B. All animals were normalized to N2. (D) Egg laying rate of indicated strains at 36–42 hr and 60–66 hr after the L4 larval stage.
Figure 3—figure supplement 2. Reproductive output of indicated strains at indicated times.


Six L4 animals from each strain were synchronized and placed on assay plates for the indicated timepoints. Progeny from each plate were counted to calculate the average reproductive output. Light colored lines indicate each replicate and solid colored lines show the average egg laying rate of each strain. For statistical significance, N2 vs. kah68 comparison is in green and N2 vs. n4295 comparison is in blue.
Figure 3—figure supplement 3. Fecundity analysis of the indicated alleles of nurf-1.


kah96 and kah113 are CRISPR-induced STOP codon replacement mutations in exon 18 of nurf-1. Despite their identical nucleotide changes, in C. elegans nomenclature, they are given unique allele names to indicate their origin from independent CRISPR-Cas9 experiments.

To further delineate the biological role of each isoform, we used CRISPR-Cas9 to engineer nine stop codons in eight exons of the nurf-1 gene: the first, second (two positions), 7th, 15th, 18th, 19th, 23rd, and 26th exons (Figure 3A). The predicted effects of these stop codons on each major isoform are shown in Figure 3D and Table 2. Animals homozygous for each mutation were assayed for total brood size and growth rate. The phenotypes of these mutants indicated that our working model was incorrect. Instead, we propose that both the B and D isoforms are essential for reproduction.

Table 2. Predicted effect of stop codon mutations on NURF-1 isoforms.

Isoform kah90 kah91 kah106 kah142 kah93 kah96 kah99 kah11 kah68 Length
NURF-1.A 5R 107P 147E 646A 1548G 1632T 1685Q, 1689P, 1693N 2056T 2197
NURF-1.B 5R 107P 147E 646A 1548G - - - 1621
NURF-1.D - - - 170G 254T 307Q, 311P, 315N 675T 816
NURF-1.F - - - - - - 3L 440T 581
NURF-1.Q - - - 170G - - - 243
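
The arithmetic implied by Table 2 can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical, only the entries that can be read unambiguously from the table are included, and for kah99 (for which the table lists three residues) only the first position is used. The quantity computed is the number of residues C-terminal to the edited position, which for kah93 in NURF-1.B reproduces the 73 of 1621 amino acids noted in the Figure 4 legend.

```python
# Isoform lengths in amino acids, from the "Length" column of Table 2.
ISOFORM_LENGTH = {"A": 2197, "B": 1621, "D": 816, "F": 581, "Q": 243}

# Residue position of each engineered stop codon in each isoform (Table 2).
# Alleles absent from an isoform's dictionary are treated as not affecting it.
# For kah99, only the first of the three listed positions is recorded.
STOP_POSITION = {
    "A": {"kah90": 5, "kah91": 107, "kah106": 147, "kah142": 646,
          "kah93": 1548, "kah96": 1632, "kah99": 1685, "kah68": 2056},
    "B": {"kah90": 5, "kah91": 107, "kah106": 147, "kah142": 646,
          "kah93": 1548},
    "D": {"kah93": 170, "kah96": 254, "kah99": 307, "kah68": 675},
    "F": {"kah11": 3, "kah68": 440},
    "Q": {"kah93": 170},
}


def residues_lost(isoform, allele):
    """Residues C-terminal to the edited position, or None if unaffected."""
    position = STOP_POSITION[isoform].get(allele)
    if position is None:
        return None
    return ISOFORM_LENGTH[isoform] - position


if __name__ == "__main__":
    for isoform in ISOFORM_LENGTH:
        for allele in sorted(STOP_POSITION[isoform]):
            lost = residues_lost(isoform, allele)
            print(f"NURF-1.{isoform} {allele}: {lost} of "
                  f"{ISOFORM_LENGTH[isoform]} residues lost")
```

Listing the alleles this way makes the contrast between early stops (which remove nearly the entire protein) and late stops such as kah93 in the B isoform easy to see.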

As expected, engineering stop codons in the first, second, and 7th exons greatly reduced fecundity, resulting in either sterility or a mortal germline phenotype, in which total brood size is initially reduced before complete sterility arises after around three to five generations of homozygosity (Figure 3B and C). Although the qualitative phenotypes of these four alleles agreed, we observed interesting quantitative differences between them. The second stop codon in the second exon (kah106) and the stop codon in the 7th exon (kah142) reduced growth and fecundity more than the first exon stop codon (kah90) or the first stop codon in the second exon (kah91) (Figure 3B and C). We suspect this result indicates the presence of an internal ribosome entry site in the middle of the second exon at the 122nd methionine, causing the expression of two isoforms from a single transcript. The reduced severity of the first two stop codon alleles could then be explained by their inability to affect the protein sequence of the second isoform. An alternative possibility is a difference in the frequency of translational read-through of each stop codon, in which the stop codon is interpreted as a sense codon at a low frequency (Jungreis et al., 2011).

Unexpectedly, engineering stop codons in the 18th and 19th exons also caused a mortal germline phenotype (kah96 and kah99) (Figure 3B). This result was surprising, because the n4295 allele, predicted to be a loss-of-function allele for the D and F isoforms due to the loss of the PHD domains and bromodomain, does not have a mortal germline phenotype. We excluded a number of potential explanations for this discrepancy. A suppressor of the n4295 allele could have fixed during the construction of this strain. However, the kah68 allele, which contains a stop codon within the n4295 deleted region, phenocopies the n4295 allele and not the kah96 and kah99 animals (Figure 3B and C, and Figure 3—figure supplement 2). Another possibility is that the D isoform suppresses the F isoform; loss of both isoforms (in the n4295 background) is tolerated, but loss of just the D isoform (in the kah96 or kah99 backgrounds) allows the F isoform to prevent reproduction. However, we could exclude this possibility, as the double mutant containing both the n4295 allele and the 18th exon stop allele phenocopied the kah96 single mutant (Figure 3—figure supplement 3). Additionally, specific loss of the F isoform by the 23rd exon stop allele (kah11) did not affect the phenotype of animals (Figure 3B and C). Our data suggest that, unlike human BPTF, the ability of NURF-1 to bind modified histones is not required for its function. We further confirmed this hypothesis by editing conserved residues in the PHD domains and bromodomain that are necessary for recognition of the H3K4me3 and H4K16ac marks (Figure 3—figure supplement 1).

The most parsimonious explanation of our data is that either the A or D isoform is essential for reproduction in C. elegans. Compound heterozygote tests allowed us to distinguish between these possibilities, indicating that the D isoform is required for reproduction and wild-type growth rate, and that the A isoform is dispensable for reproduction and development (Figure 4). We first verified that the kah93, kah96, and kah106 alleles were recessive by measuring the fecundity of heterozygous animals (Figure 4B). Next, we examined the fecundity of kah106/kah96 compound heterozygotes, which are predicted to lack only the A isoform, due to the production of a single unaffected copy of the B isoform from the kah96 haplotype and a single unaffected copy of the D isoform from the kah106 haplotype. If the A isoform were essential for reproduction, we would expect these compound heterozygotes to be sterile or to have severe defects in fecundity. However, these animals were indistinguishable from wild-type, suggesting that the full-length A isoform is not essential (Figure 4B). The kah106/kah93 compound heterozygotes showed similar results. These animals are predicted to encode one unaffected copy of the D isoform, one truncated copy of the B isoform, and zero unaffected copies of the A isoform. These animals were mostly wild-type, with a small reduction in total fecundity (Figure 4B). We interpret this to mean that the A isoform is not essential and that the truncation of the B isoform slightly perturbs its function, causing a slight reduction in fecundity. Finally, we analyzed kah93/kah96 compound heterozygotes. These animals are predicted to encode zero wild-type copies of the D isoform, one wild-type copy of the B isoform, and zero wild-type copies of the A isoform. These animals were essentially sterile. Taken together, we conclude that the B and D isoforms are both essential for reproduction.
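
The copy-number bookkeeping behind these compound heterozygote predictions can be written out explicitly. The following Python sketch is illustrative only: the function and dictionary names are hypothetical, "+" is used here to denote a wild-type haplotype, and the isoforms disrupted by each allele are taken from Table 2, with the C-terminal truncation of NURF-1.B by kah93 tracked separately as a milder lesion, following the text above.

```python
ISOFORMS = ("A", "B", "D")

# Isoforms whose coding sequence is severely disrupted by each allele
# (taken from Table 2). "+" denotes a wild-type haplotype.
DISRUPTED = {
    "+": set(),
    "kah106": {"A", "B"},
    "kah93": {"A", "D"},
    "kah96": {"A", "D"},
}

# Isoforms that are only mildly truncated by an allele: kah93 removes
# the last 73 of 1621 residues of NURF-1.B.
TRUNCATED = {"kah93": {"B"}}


def predicted_copies(haplotype1, haplotype2):
    """Return {isoform: (unaffected copies, mildly truncated copies)}."""
    result = {}
    for isoform in ISOFORMS:
        unaffected = 0
        truncated = 0
        for haplotype in (haplotype1, haplotype2):
            if isoform in TRUNCATED.get(haplotype, set()):
                truncated += 1
            elif isoform not in DISRUPTED[haplotype]:
                unaffected += 1
        result[isoform] = (unaffected, truncated)
    return result


if __name__ == "__main__":
    for genotype in (("kah106", "kah96"), ("kah106", "kah93"),
                     ("kah93", "kah96"), ("kah96", "+")):
        print("/".join(genotype), predicted_copies(*genotype))
```

Under these assumptions, only the kah93/kah96 genotype is predicted to retain no unaffected copy of the D isoform, which is the genotype reported above as essentially sterile.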

Figure 4. Genetic analysis suggests the NURF-1.B and NURF-1.D isoforms are essential for reproduction in C. elegans.

(A) Genomic positions of stop codon or classical deletion mutations used for compound heterozygote or transgenic rescue analysis of B and C. kah3 is a CRISPR/Cas9 genomic edit of the LSJ2-derived 60 bp deletion. (B) Fecundity of homozygote (red), heterozygote (green), and compound heterozygote mutants (yellow) as indicated in the x-axis. The table below the x-axis is the predicted effect of each mutant strain on the indicated nurf-1 isoforms. The number in the table indicates the number of functional copies. The star indicates the milder predicted effect of kah93 on NURF-1.B, as it only truncates 73 of 1621 amino acids. The y-axis shows the fecundity for each strain. (C) Fecundity of indicated strains with and without the presence of an integrated nurf-1.d transgene. The genetic background is also indicated. N2* contains ancestral introgressions of the npr-1 and glb-5 genes.

Figure 4—source data 1. Source data for Figure 4.
DOI: 10.7554/eLife.48119.024


Figure 4—figure supplement 1. Egg-laying rate of n4295 and ARL(del, LSJ2>N2*) transgenic nurf-1.d cDNA rescue.


Egg-laying rate was calculated at the indicated time points. Six L4 animals were picked onto each assay plate (time = 0) for the indicated times. The total number of eggs was counted for each plate and was used to calculate the average number of eggs laid per hour for each animal. Each individual trial is shown as a dim line. The mean for each strain is shown as a bold, colored line.
Figure 4—figure supplement 2. Heat shock specifically upregulates NURF-1.F.


(A) Coverage of RNA-seq reads of control and heat shocked C. elegans animals aligned to the nurf-1 genomic location. These data were taken from Brunquell et al. (2016). The x-axis shows the genomic location where the sequencing reads mapped, including the location of nurf-1 transcripts. We also show the position of a precise deletion of the 23rd exon edited into two strains, created using CRISPR/Cas9. In C. elegans genetic nomenclature, each independently generated genetic mutation is given a unique allele name, even if the mutations are genetically identical. The deletion edited into the N2 strain is named kah149. The deletion edited into a strain containing an in-frame FLAG epitope tag (shown as a black box) is named kah144. The y-axis indicates the number of reads at each location. (B) Quantification of RNA abundance for five nurf-1 transcripts in response to heat shock. Data taken from Li et al. (2016), who heat shocked L2 animals at 34°C for 30 min, and Brunquell et al. (2016), who heat shocked L4 animals at 33°C for 30 min. The y-axis is the estimated transcripts per million (tpm) for each isoform in each condition. (C) Western blotting of a strain containing the FLAG-tag fused at the position shown in panel A using an anti-FLAG antibody. We detected two bands, one matching the predicted size of the NURF-1.F isoform, that were both upregulated by heat shock (34°C). (D) Western blotting of three strains containing a FLAG-tag and/or a deletion allele predicted to ablate the nurf-1.f transcript. The x-axis shows the presence or absence of the various alleles along with the environmental condition. We detected two bands that were induced by heat shock. Observation of these bands required the FLAG epitope tag and could be ablated by the 23rd exon deletion. (E) Multi-dimensional scaling plot (MDS) of the N2 (red) and a strain carrying the kah149 deletion of the 23rd exon (blue) in response to various heat shock conditions. No HS indicates no heat shock, 2 hr or 4 hr HS indicates two or four hours of heat shock at 34°C, and 2 hr HS + recovery indicates animals experiencing two hours of heat shock at 34°C followed by 0.5 hr of recovery at 20°C. The overall transcriptional response was the same in both strains. (F) Number of genes significantly up or down regulated between N2 and kah144 animals at the indicated conditions. (G) Scatter plot of all genes differentially expressed in four hour heat shock or two hour heat shock + recovery conditions. The R2 value was 0.4421.

To confirm that the D isoform is essential, we also created a transgenic strain containing an integrated construct driving a nurf-1.d cDNA from its endogenous promoter. This transgene could fully rescue the fecundity phenotype of the kah96 allele and partially rescue the fecundity phenotype of the kah93 allele (Figure 4C). This transgene could also rescue the reproductive timing and fecundity changes of the n4295 allele and the LSJ2-derived 60 bp deletion (kah3) (Figure 4C and Figure 4—figure supplement 1). As expected, this transgene could not rescue the kah106 allele, which creates a stop codon in the B isoform. These data further support a requirement of both the B and D isoforms for reproduction.

Although the F isoform does not seem to have an effect in normally developing animals, it is involved in the heat shock response. Multiple reports have demonstrated that nurf-1 is upregulated in response to heat shock (Brunquell et al., 2016; Li et al., 2016). By analyzing RNA-seq reads from these two papers, we found that the nurf-1.f transcript was specifically upregulated in both datasets, with increased coverage of the 23rd exon as well as the 24th through 28th exons (Figure 4—figure supplement 2A and B). We confirmed that the increased transcription of the nurf-1.f transcript also increased NURF-1.F protein abundance (Figure 4—figure supplement 2C and D). Transcriptional analysis of strains lacking the F isoform indicated that the initial transcriptional response to heat shock was largely the same, but the long-term transcriptional response of a subset of genes was affected (Figure 4—figure supplement 2E–G). We conclude that the F isoform is specifically up-regulated by heat shock and plays a modulatory role in determining the long-term transcriptional response to heat shock.

The B and D isoforms have opposite effects on cell fate during gametogenesis

Although the B and D isoforms are both required for reproduction, the molecular mechanism that these isoforms operate through could be different. One possibility is that the long-form of NURF-1 has split into two subunits - both isoforms participate as part of the NURF complex, cooperating together to regulate reproduction. However, the D isoform might instead modify NURF activity by competing for binding with transcription factors or regions of the genome to which NURF is recruited. A third possibility is that the D isoform acts through a NURF-independent pathway.

To gain insights into the molecular nature of the D isoform, we decided to determine precisely how the B and D isoforms regulate reproduction, using three nurf-1 stop alleles (Figure 5A). For hermaphrodites to produce a fertilized egg, the gonads must produce both male and female gametes at different developmental times (Figure 5B). Initially, gametogenesis produces sperm, creating approximately 300 sperm at which point a permanent sperm-to-oocyte switch occurs. From this time, gametogenesis produces oocytes until the animal dies or the gonad ceases to function (Hubbard and Greenstein, 2005). A number of defects could cause sterility – inability to form gametes, inability to create sperm, inability to create oocytes, or defects in the sperm and/or oocyte function. We used DAPI staining to characterize the production of sperm and oocytes in three nurf-1 mutants (Figure 5C and D). We first tested kah106 mutants, which lack the B isoform (Figure 5A), for the ability to produce sperm. Compared with N2 animals, which create ~ 300 sperm per animal, the number of sperm produced by kah106 animals was greatly reduced, resulting in the production of only approximately 60 sperm (Figure 5D). These animals produced a normal number of oocytes, indicating that spermatogenesis seemed to be affected specifically (Figure 5E). We interpret these data as evidence that hermaphrodites that lack the NURF-1.B isoform spend less time in spermatogenesis before transitioning to oogenesis. We next tested kah96 mutants which lack the D isoform. These animals produced approximately 500 sperm and almost no oocytes (Figure 5C-E). We interpret these data as evidence that hermaphrodites that lack the D isoform are unable to transition from spermatogenesis to oogenesis. Finally, we performed similar experiments on kah93 mutants, which lack the D isoform and have a truncated B isoform. 
These animals showed an intermediate phenotype, with a normal number of sperm but a reduced number of oocytes (Figure 5D and E). The reduced activity of the B isoform due to its truncation potentially allows other factors to transition the animals to oogenesis, resulting in the milder defects found in the kah93 animals (Figure 3B).

Figure 5. NURF-1.B and NURF-1.D have opposite effects on the sperm-to-oocyte switch in hermaphrodites.

(A) Genomic position of the previously-described stop codon mutants used in B and C. (B) Summary of gametogenesis of C. elegans. Animals undergo spermatogenesis during the late L4 and then transition to oogenesis stage during maturation to adulthood. The number of sperm produced during spermatogenesis can be determined by counting sperm in the spermatheca when oogenesis has begun. (C) Representative fluorescence images of one spermatheca for DAPI stained young adult animals. Each tiny dot represents the condensed chromosomes of a single sperm. (D) Sperm number of indicated strains. L4 animals were synchronized and allowed to develop for an additional 12 hr. DAPI staining was used to identify and count the number of sperm in each animal. Each dot represents a single animal. (E) Oocyte number of indicated strains. L4 animals were synchronized and allowed to develop for an additional 12 hr. DAPI staining was used to identify and count the number of oocytes in each animal.

Figure 5—source data 1. Source data for Figure 5.
DOI: 10.7554/eLife.48119.027


Figure 5—figure supplement 1. Transcriptome analysis of strains containing N2/LSJ2 genetic variation linked to nurf-1.


(A) Multi-dimensional scaling plot (MDS) of CX12311 (N2*), ARL(del,LSJ2>N2*) (PTM88), NIL(nurf-1,LSJ2>N2*) (PTM66) and LSJ2. The x-axis and y-axis are the two dimensions used to separate samples from different biological conditions based upon the transcriptional changes between samples. (B) Scatter plot of all genes detected in RNA sequencing. The x-axis is the log2 of the relative expression change of each gene in N2* vs. ARL(del,LSJ2>N2*) (indicating transcriptional responses induced by the 60 bp deletion). The y-axis is the log2 of the relative expression change of each gene in NIL(nurf-1,LSJ2>N2*) vs. ARL(del,LSJ2>N2*) (indicating transcriptional responses induced by other mutations linked to the 60 bp deletion including the LSJ2-derived intron SNV). The transcriptional changes are positively correlated with an R2 value of 0.63, indicating the genetic variation regulates a common subset of genes. The fog-3 gene is shown in red.

Although animals that lack either the B or D isoform are unable to reproduce, the cause of sterility is different at the cellular level. To further study the molecular effects of perturbing nurf-1 function, we transcriptionally profiled adult N2*, NIL(nurf-1,LSJ2>N2*), ARL(del, LSJ2>N2*), and LSJ2 animals, which contain various combinations of the N2 and LSJ2-derived nurf-1 mutations (Supplementary file 1). A multi-dimensional scaling plot indicated that the N2* and ARLdel replicates formed two unique clusters, and the LSJ2 and NILnurf-1 replicates largely overlapped in a third cluster (Figure 5—figure supplement 1A). The genetic variation surrounding the nurf-1 locus is therefore responsible for the majority of transcriptional differences between adult LSJ2 and N2* animals, suggesting that most of the other fixed variants do not have a dramatic effect on transcription in N2-like growth conditions. Although the LSJ2-derived 60 bp deletion regulates transcription, additional genetic variation in the NILnurf-1 strain, presumably from the N2-derived intron variant, also regulates transcription in adult animals.

To study the effects of the 60 bp deletion and intron SNV on transcription, we focused on two comparisons: 1) the N2* vs ARL(del, LSJ2>N2*), which will identify transcriptional changes caused by the 60 bp deletion and 2) the NIL(nurf-1, LSJ2>N2*) vs ARL(del, LSJ2>N2*), which will identify transcriptional changes caused by the intron SNV (as well as linked mutations in the NIL other than the 60 bp deletion). We expect that the latter comparison will mostly report the changes of the intron SNV, as it accounts for most of the fitness differences between the two strains. We observed a positive correlation between these two comparisons (Figure 5—figure supplement 1B). The most parsimonious explanation for this observation is that both the N2 and LSJ2-derived alleles in nurf-1 regulate the activity of a common molecular target, which is likely to be the NURF complex.
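
The comparison described above amounts to computing two log2 fold changes per gene against the shared ARL(del, LSJ2>N2*) strain and correlating them. The Python sketch below illustrates that calculation under stated assumptions: the expression values are toy numbers invented for illustration (they are not data from this study), the gene names are placeholders, a pseudocount of 1 is an arbitrary choice, and the direction of each ratio is assumed rather than taken from the analysis pipeline.

```python
import math


def log2_fold_change(numerator, denominator, pseudocount=1.0):
    """log2 ratio of two expression values, with a pseudocount (assumed)."""
    return math.log2((numerator + pseudocount) / (denominator + pseudocount))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy expression values, NOT from the paper: gene -> (N2*, NIL, ARL).
TOY_EXPRESSION = {
    "gene_1": (100.0, 90.0, 50.0),
    "gene_2": (20.0, 25.0, 40.0),
    "gene_3": (10.0, 10.0, 10.0),
    "gene_4": (300.0, 200.0, 80.0),
}

# Comparison 1: N2* vs ARL (effect attributed to the 60 bp deletion).
deletion_effect = [log2_fold_change(n2, arl)
                   for n2, _nil, arl in TOY_EXPRESSION.values()]
# Comparison 2: NIL vs ARL (effect attributed to the intron SNV and linked variants).
intron_effect = [log2_fold_change(nil, arl)
                 for _n2, nil, arl in TOY_EXPRESSION.values()]

r = pearson_r(deletion_effect, intron_effect)

if __name__ == "__main__":
    print(f"R^2 between the two comparisons (toy data): {r * r:.2f}")
```

Because both comparisons use the same ARL strain as the reference, some positive correlation can arise from shared measurement noise in that strain alone, which is worth keeping in mind when interpreting the R2 value.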

A duplication in a sister clade of Caenorhabditis species creates two separate nurf-1 genes

Previous work in C. briggsae characterized the role of nurf-1 in reproduction, including the isolation of nurf-1 cDNAs in this species (Chen et al., 2014). Interestingly, although transcripts matching nurf-1.b and nurf-1.d were isolated from this species, they no longer shared any exons with each other, suggesting that they were expressed from two separate genes (Figure 6A). Further, the spliced leader sequences at the 5’ end of both transcripts matched the SL1 sequence, suggesting that these two genes were not expressed as a single operon (Blumenthal, 2012). We compared the gene products using BLAST and found that the shared exons in C. elegans had duplicated in C. briggsae, with one copy retained in each of the new genes (Figure 6A). Short-read transcriptomics data for this species matched the cDNA analysis; we found evidence for transcripts matching nurf-1.b, nurf-1.d, and nurf-1.f (Figure 6—figure supplements 1–3). Unlike C. elegans, C. briggsae seemed to have lost both the nurf-1.a and nurf-1.q transcripts.

Figure 6. A duplication of the shared exons of the nurf-1.b and nurf-1.d transcripts resulted in the split of nurf-1 into two separate genes in a subclade of Caenorhabditis species.

(A) Comparison of two species with different versions of nurf-1. In C. elegans, nurf-1.b and nurf-1.d overlap in the 14th and 15th exons (shown in orange). In C. briggsae, a duplication of the orange exons resulted in separation of nurf-1.b and nurf-1.d into separate genes. C. briggsae has also lost expression of the nurf-1.a and nurf-1.q transcripts. (B) Distribution of the two versions of nurf-1 (shown in panel A) in 32 Caenorhabditis species. Red indicates the C. elegans version, blue indicates the C. briggsae version, and black indicates a nurf-1 version that could not be determined. The species phylogeny suggests that a duplication event occurred in the common ancestor of the brenneri/tribulationis clade. (C) The most well supported topology of the duplicated region is consistent with a single duplication event. Orange indicates protein sequence from the duplicated region in the nurf-1–1 gene, and turquoise indicates protein sequence from the duplicated region in the nurf-1–2 gene. Non-colored branches indicate unduplicated nurf-1 sequence. The rate of amino acid substitution in the nurf-1–1 duplicated region has also increased, as seen in the branch lengths. Scale is in substitutions per site.

Figure 6—source data 1. Source data for Figure 6.
DOI: 10.7554/eLife.48119.035


Figure 6—figure supplement 1. nurf-1 isoform structure for 22 Caenorhabditis species.


From the phylogenetic tree of 32 Caenorhabditis species, we determined the nurf-1 gene structure of 22 species using genome and transcriptome information. Species with one nurf-1 gene (in red) are consistent with expression of five transcripts orthologous to those of C. elegans: nurf-1.a, nurf-1.b, nurf-1.q, nurf-1.d and nurf-1.f. Species with two nurf-1 genes (in blue) contain duplicated sequence and transcripts matching C. elegans nurf-1.b (nurf-1–1), nurf-1.d (nurf-1–2.d) and nurf-1.f (nurf-1–2.f). Exons corresponding to the duplicated region are labeled in orange.
Figure 6—figure supplement 2. Sashimi plots for Caenorhabditis species with one nurf-1 gene.


Only species with a published genome and transcriptome were plotted. Each peak shows the coverage for each exon, and each trajectory shows an exon-exon junction supported by RNA-seq reads.
Figure 6—figure supplement 3. Sashimi plots for Caenorhabditis species with two nurf-1 genes.


Only species with a published genome and transcriptome were plotted. Each peak shows the coverage for each exon, and each trajectory shows an exon-exon junction supported by RNA-seq reads. These plots show that no reads support splicing from nurf-1–1 to nurf-1–2, which further supports the split of nurf-1 in these species.
Figure 6—figure supplement 4. Five hypothetical topologies related to the timing and number of duplication events involved in the nurf-1 gene split.


Only those five topologies with the lowest log-likelihoods are shown. logL: log-likelihoods, AU: p-values of the approximately unbiased test. Orange circles indicate duplication events. Trees 3–5 were rejected by the AU test and are highlighted in red. Analyses were performed using IQ-TREE with the JTT substitution model with gamma-distributed rate variation among sites.
Figure 6—figure supplement 5. Maximum likelihood tree of the B isoform and nurf-1–1.


The duplicated region was removed from the alignment prior to inference. Estimated using IQ-TREE using the JTT substitution model with gamma-distributed rate variation among sites. Bootstrap support values are indicated as labels on branches. Scale is in amino acid substitutions per site.
Figure 6—figure supplement 6. Maximum likelihood tree of the duplicated region of nurf-1 in 22 species.


Estimated using IQ-TREE using the JTT substitution model with gamma-distributed rate variation among sites. Bootstrap support values are indicated as labels on branches. Scale is in amino acid substitutions per site.

Analysis of the nurf-1 gene structure within the context of the Caenorhabditis phylogeny suggested that the exon duplication and separation of nurf-1 into separate genes occurred at the base of a clade containing 11 described species, including C. brenneri and C. tribulationis (Figure 6B). We determined the nurf-1 gene structure in 22 of the 32 Caenorhabditis species with published genomes and transcriptomes (Kiontke et al., 2011; Stevens et al., 2019) (Figure 6—figure supplements 1–3). Like C. briggsae, the species in the brenneri/tribulationis clade express a transcript matching nurf-1.b from a single gene (which we call nurf-1–1). These species also express two transcripts matching nurf-1.d and nurf-1.f from a second gene, which we call nurf-1–2. Analysis of the spliced leader sequence at the 5’ end of the nurf-1.d transcript only identified the SL1 sequence, consistent with separation of these genes into distinct transcriptional units. None of these species appears to express nurf-1.a or nurf-1.q transcripts (Figure 6—figure supplements 1–3). RNA-seq data for species outside of this clade (Figure 6—figure supplements 1–3) matched the transcription pattern of C. elegans, suggesting that these species express five major transcripts from a single nurf-1 gene: nurf-1.a, nurf-1.b, nurf-1.d, nurf-1.f, and nurf-1.q. These data suggest that the C. elegans transcript structure is ancestral.

Phylogenetic analysis of the duplicated ~ 200 amino acid sequence was used to evaluate different hypotheses surrounding the timing and number of duplication events. The analysis supported the model that the split of nurf-1 into two distinct genes happened once within the common ancestor of the brenneri/tribulationis clade (Figure 6C – additional possible trees shown in Figure 6—figure supplement 4). The topology recovered for the region of nurf-1 outside the duplication is congruent with the species tree (Figure 6—figure supplement 5). Interestingly, the rate of amino acid substitution in the duplicated region was accelerated in nurf-1–1 relative to nurf-1–2 (p<0.001; Welch's t-test) suggesting that this region experienced positive selection and/or relaxed selection after this duplication event occurred. Comparison of the synonymous vs. non-synonymous substitution rate in three closely-related species pairs was also consistent with an increase in the rate of protein evolution in the duplicated region following the separation of nurf-1 into independent genes (Table 3).

Table 3. dN/dS ratio in three Caenorhabditis species pairs.

Columns 3–5: nurf-1–1 or nurf-1.b. Columns 6–8: nurf-1–2 or nurf-1.d.
Sp. pair Duplication (a) Dup. Reg. (b) Other (c) Ratio (d) Dup. Reg. (b) Other (c) Ratio (d)
C. afra/C. sulstoni N 0.136 (e) 0.121 1.1 0.116 (e) 0.072 1.6
C. nigoni/C. briggsae Y 0.249 0.085 2.9 0.111 0.019 5.8
C. remanei/C. latens Y 0.295 0.121 2.4 0.177 0.048 3.7

a Duplication indicates whether the species pair contain the duplicated exons that create two nurf-1 genes.

b Dup. Reg. indicates dN/dS was calculated using the region of the alignment that contains the duplication.

c Other indicates dN/dS was calculated using the region of the alignment that does not contain the duplication.

d Ratio was calculated by dividing the dN/dS value of the Dup. Reg. by the Other.

e The dN/dS values for the nurf-1.b and nurf-1.d in the duplicated region were different due to the b transcript encoding two additional amino acids in the 14th exon (before the M initiation codon in the d isoform) and the amino acids encoded by the 16th alternatively spliced exon.
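
The Ratio column of Table 3 follows directly from the tabulated dN/dS values, as footnote d describes. The short Python sketch below recomputes it as an arithmetic check. The row layout and function names are illustrative assumptions, and the sketch does not reproduce the upstream estimation of dN/dS from codon alignments.

```python
# dN/dS values copied from Table 3.
# Row layout (an illustrative choice, not from the paper):
# (species pair, duplicated exons present?,
#  nurf-1-1/nurf-1.b Dup. Reg., nurf-1-1/nurf-1.b Other,
#  nurf-1-2/nurf-1.d Dup. Reg., nurf-1-2/nurf-1.d Other)
TABLE3_ROWS = [
    ("C. afra/C. sulstoni", False, 0.136, 0.121, 0.116, 0.072),
    ("C. nigoni/C. briggsae", True, 0.249, 0.085, 0.111, 0.019),
    ("C. remanei/C. latens", True, 0.295, 0.121, 0.177, 0.048),
]


def region_ratio(dup_reg, other):
    """Footnote d: dN/dS of the duplicated region divided by the rest."""
    return round(dup_reg / other, 1)


def table3_ratios(rows):
    """Return {species pair: (ratio for b-type gene, ratio for d-type gene)}."""
    ratios = {}
    for pair, _has_dup, b_dup, b_other, d_dup, d_other in rows:
        ratios[pair] = (region_ratio(b_dup, b_other),
                        region_ratio(d_dup, d_other))
    return ratios


if __name__ == "__main__":
    for pair, (b_ratio, d_ratio) in table3_ratios(TABLE3_ROWS).items():
        print(f"{pair}: b-type {b_ratio}, d-type {d_ratio}")
```
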

Discussion

In this paper, we make use of CRISPR/Cas9-enabled gene editing to characterize the nurf-1 gene in C. elegans and then study the sequence and expression of nurf-1 orthologs in other Caenorhabditis species. The combination of genetics and evolutionary analysis allowed us to make a number of surprising observations. First, we show that an SNV in the 2nd intron of nurf-1 that fixed in the N2 laboratory strain increases the fitness and fecundity of the N2 strain. Second, we show that the full-length isoform of nurf-1 has split into two essential, mostly non-overlapping isoforms with opposite effects on cell fate in differentiating gametes. Finally, we show that the B and D isoforms have split into separate genes in a subset of Caenorhabditis species. These data show that nurf-1 can be genetically perturbed to increase fitness of animals in new environments and has experienced long-term evolutionary changes that have split its function and regulation into two isoforms/genes (Figure 7A and B).

Figure 7. Proposed antagonistic (Yin-Yang) working model of two nurf-1 isoforms in C. elegans.

(A) Descriptive phylogeny with proposed major transitions in nurf-1 isoform evolution. Each dot indicates the timepoint of a major nurf-1 isoform evolution event. (B) Proposed molecular mechanism for NURF-1 isoforms. The NURF-1.B isoform interacts with ISWI through its DDT domain to form a NURF complex capable of remodeling chromatin at specific regions of the chromosome. NURF is recruited to these regions through interactions with specific transcription factors using protein domains encoded by the overlapping exons. This remodeling is necessary for transcriptional responses for spermatogenesis. Due to some unknown signal, after spermatogenesis has resulted in the production of ~300 sperm, the NURF-1.D isoform outcompetes the NURF complex away from its target loci, causing the loss of transcription of key spermatogenesis genes, resulting in gametogenesis transitioning from spermatogenesis to oogenesis. The binding affinity of the PHD domains and bromodomain to histones strengthens this repression, but they are not completely necessary for the ability of the D isoform to outcompete the B isoform.

Evolution of NURF-1/BPTF across phyla

In humans and Mus musculus, an abundance of evidence confirms that the long-form isoform of BPTF, which is orthologous to nurf-1, is the primary isoform in the NURF chromatin remodeling complex (Alkhatib and Landry, 2011). While a subset of BPTF exons are alternatively spliced, these events do not lead to the large changes in size we observe in the nurf-1 gene. One exception is the FAC1 isoform, which encompasses 801 N-terminal amino acids of BPTF (Bowser et al., 1995). While FAC1 is found in amyloid Alzheimer’s patients and enriched in the nervous system (Bowser et al., 1995; Landry et al., 2008), a biological role for this isoform has not been described. FAC1 is smaller and lacks conserved protein sequence found in the B isoform of nurf-1, suggesting an independent evolutionary origin and function.

In Drosophila, an intermediate state between humans and nematodes is found. Two major isoforms of NURF301 (the ortholog to nurf-1) have been described: a full-length form of NURF301 analogous to the full-length mammalian BPTF and an N-terminal form of NURF301 analogous to the NURF-1.B isoform of C. elegans. Both isoforms form NURF complexes and regulate gene expression (Kwon et al., 2009). Genetic analysis suggests that full-length NURF301 is required for gametogenesis in both sexes while the N-terminal isoform is required for regulation of pupation and innate immunity.

Nematodes have retained the N-terminal isoform but seem to have lost use of the full-length isoform for most biological traits (Andersen et al., 2006). Instead, they express two C-terminal isoforms (D and F) that appear to be a recent evolutionary innovation, likely occurring before the origin of the Caenorhabditis lineage. We show that the D isoform (or the Yin isoform) is essential in C. elegans, and seems to act in opposition to the B isoform (or the Yang isoform) to regulate the sexual fate of differentiating gametes. The requirement of two antagonistic isoforms (the B and D) for reproduction is reminiscent of the principle of Yin and Yang. Genetic pathways often include both positive and negative regulators of transcription and ultimate phenotype; however, both factors are rarely encoded by the same genetic locus. While there is growing appreciation of isoform-specific regulation of many genes, nurf-1 appears to be unusually complex in this regard (although not unprecedented; see Müller and Basler, 2000; Berry et al., 2001; Wang et al., 2009).

We propose a molecular mechanism to explain the actions of the B and D isoforms to regulate transcription (Figure 7B). These two isoforms share 207 amino acids of protein sequence, which falls in a region that is thought to facilitate physical interactions with transcription factors (Alkhatib and Landry, 2011). NURF-1.B participates as part of the NURF complex, which is recruited to certain promoters by binding to transcription factors. At these loci, NURF promotes or represses expression of target genes by remodeling the chromatin surrounding promoters and gene bodies. For unknown molecular reasons, NURF-1.D preferentially binds to these transcription factors, displacing the NURF complex from these genomic regions, causing a change of chromatin state and gene expression.

Microevolution of NURF-1/BPTF

We showed that independent, beneficial alleles in nurf-1 were fixed in two laboratory strains of C. elegans that each experienced an extreme shift in environment from their natural habitats. The N2-derived SNV results in the change of a run of homopolymers in the 2nd intron of the nurf-1.b transcript. Such a change could act as an enhancer for the nurf-1.d promoter, but the nature of the genetic change and position is more consistent with a role in regulating the nurf-1.b transcript. Analysis of RNA sequencing data did not identify any obvious changes in levels of any of the nurf-1 transcripts and it is unclear by what molecular mechanism it regulates nurf-1 activity. Potentially, it could increase pausing of the RNA polymerase at the homopolymer run or could regulate RNA splicing by changing the secondary structure of the RNA molecule. In general, such a mutation would not be predicted by most bioinformatic approaches to have a phenotypic effect. Only the low genetic diversity between the LSJ2 and N2 strains allowed us to focus on this variant, and eventually demonstrate this particular variant is causal.

The probability of two beneficial mutations happening in both lineages by random chance is quite small. Fewer than 300 genes (out of ~20,000 total) harbor derived mutations in either the N2 or LSJ2 lineage (McGrath et al., 2011). Only a handful of these fixed mutations are expected to be beneficial; our recent QTL mapping of fitness differences on agar plates only identified the nurf-1 locus (Zhao et al., 2019) and the small effective population sizes (~4–100) are expected to lead to the fixation of a number of nearly-neutral mutations through genetic drift and draft. Our work suggests nurf-1 is a genetic target for adaptation to the extreme changes in environments associated with laboratory growth.

Targeting of nurf-1 is consistent with its role as a regulator of life history tradeoffs. Many traits influence individual and offspring survival; however, the mapping of these traits onto fitness is thought to be dependent on the environmental niche an organism occupies. The LSJ2-derived deletion in nurf-1 modified life history tradeoffs to prioritize individual survival over reproduction; by shunting energy away from reproduction and growth, animals carrying it increased their chances of surviving on poor, unnatural food. N2 animals grew on agar plates seeded with E. coli bacteria, which they can readily consume and metabolize into a useful energy source. In these conditions, survival is not the primary concern; each animal has three days to eat as much food as possible and produce as many progeny as possible to maximize the probability that one of its offspring is transferred to the new food source. It is reasonable to think that the N2 and LSJ2 laboratory conditions represent opposite extremes along a life history axis encompassing individual survival and reproduction. The N2 mutation favors reproduction while the LSJ2 mutation favors survival.

In humans, genetic alterations in BPTF have been reported in several types of cancer and a role of BPTF in transcriptional regulation by c-MYC has been demonstrated, in agreement with its chromatin-binding function (Richart et al., 2016). Using well-characterized and validated antibodies against BPTF, we found several molecular species with unexpected electrophoretic mobilities in human cancer cells (Figure 2—figure supplement 4). Using mass-spectrometry, we confirmed the presence of multiple BPTF peptides in the bands detected by western blotting (Figure 2—figure supplement 4). These findings raise the possibility that these protein sequence variants have non-canonical functions. Given that stress adaptation is a hallmark of cancer - allowing tumor cells to survive and evolve following Darwinian selection processes - and the role of nurf-1 in C. elegans demonstrated here, it is tempting to speculate a role for such diversity of isoforms in the life histories of cancer cells. However, our work simply shows that additional forms of BPTF exist. Whether they have a biological role still needs to be determined.

Split of nurf-1 into separate genes potentially resolves conflict between the Yin and Yang isoforms caused by shared exons

In a clade of Caenorhabditis nematodes, the nurf-1 gene has split into two separate genes, an example of gene birth resulting in the duplication of a portion of the nurf-1 gene. Multigene families are common in most species and protein domains are often shuffled between genes. While the importance of gene duplication is not controversial, the exact mutational events and evolutionary forces responsible for the fixation of independent genes with different functions is less understood. Here we seem to have uncovered an example of how gene sharing, specifically through the creation of unique isoforms, can contribute to this process. In the lineage that led to the C. elegans species, nurf-1 first evolved changes in isoform use, resulting in the creation and essential action of the nurf-1.d transcript, and the loss of essentiality of the long nurf-1.a transcript. In this case, partitioning of the biological function and protein domains in each nurf-1 isoform created diversification of protein products.

What are the evolutionary forces responsible for the split of nurf-1 into two genes? One possibility is developmental system drift. Under this scenario, the separation of the two isoforms into two distinct genes does not signify any important evolutionary change in the function of the two genes. Neutral processes are responsible for the initial fixation of the duplication and the change does not provide any future evolutionary benefit.

However, there are a few additional possible ways adaptive evolution could play a role. First, correlated with the separation of the Yin and Yang transcripts into two genes is the loss of the full-length nurf-1.a and nurf-1.q transcripts. Loss of these transcripts could have provided a fitness benefit for animals. Consider that, in order to produce both the nurf-1.b and nurf-1.d transcripts (i.e. the Yin and Yang transcripts) in the same cell, there must be a mechanism to distinguish between transcripts containing the 1st to 15th exons (the nurf-1.b transcripts) and transcripts initiating from the 14th exon (the nurf-1.d transcripts). In the former case, the 15th exon is spliced to the 16th exon to terminate the transcript. In the latter case, the 15th exon is spliced to the 17th exon, along with the remaining 3' exons. Alternatively, the cell might not distinguish between transcripts, but instead use each alternative splice site at a constant ratio (i.e. 80% of the time, the 15th exon is spliced to the 16th exon and 20% of the time, the 15th exon is spliced to the 17th exon). In this alternative scenario, two additional transcripts must be produced. Intriguingly, these two transcripts match nurf-1.a and nurf-1.q, suggesting these transcripts are non-functional byproducts of molecular conflict between nurf-1.b and nurf-1.d. Potentially, production of the nurf-1.a and nurf-1.q transcripts could come at an energetic cost.
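
The constant-ratio scenario can be made concrete by enumerating every combination of transcription start site and splice choice. The sketch below is illustrative only. The 80%/20% splice ratio is the example given in the text, the 50/50 start-site usage is an arbitrary placeholder, and the assignment of the two extra products to nurf-1.a (exon 1 start, spliced through to the 3' exons) and nurf-1.q (exon 14 start, terminated at the 16th exon) is inferred from the transcript descriptions in this paper rather than taken from a quantitative model.

```python
from itertools import product

# Hypothetical usage of the two transcription start sites (placeholder values).
START_SITES = {"exon 1": 0.5, "exon 14": 0.5}

# Splice choice at the 15th exon, using the 80%/20% example from the text.
SPLICE_CHOICES = {"15->16 (terminate)": 0.8, "15->17 (continue)": 0.2}

# Which transcript each combination yields (assignment inferred from the
# transcript descriptions; treat as an assumption).
TRANSCRIPT_NAMES = {
    ("exon 1", "15->16 (terminate)"): "nurf-1.b",
    ("exon 1", "15->17 (continue)"): "nurf-1.a",
    ("exon 14", "15->16 (terminate)"): "nurf-1.q",
    ("exon 14", "15->17 (continue)"): "nurf-1.d",
}


def expected_fractions(starts, splices):
    """Fraction of each transcript if start site and splice choice are independent."""
    fractions = {}
    for (start, p_start), (splice, p_splice) in product(starts.items(),
                                                        splices.items()):
        fractions[TRANSCRIPT_NAMES[(start, splice)]] = p_start * p_splice
    return fractions


if __name__ == "__main__":
    for name, fraction in sorted(expected_fractions(START_SITES,
                                                    SPLICE_CHOICES).items()):
        print(f"{name}: {fraction:.2f}")
```

Under independence, any nonzero use of both start sites and both splice choices yields all four transcripts, which is the point of the argument: nurf-1.b and nurf-1.d cannot be made in the same cell this way without also making the other two.
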

Multiple lines of evidence are consistent with the nurf-1.a and nurf-1.q transcripts lacking biological roles. First, while the nurf-1.q transcript is produced at high levels, we were unable to observe its product in our immunoblots, suggesting that it is either not translated or the protein product is rapidly degraded. Second, our genetic tests were unable to identify a biological role for nurf-1.a. Third, we observe a loss of both the nurf-1.a and nurf-1.q transcripts in the species that have split nurf-1 into two genes. It would have been quite easy for these species to retain expression of nurf-1.q in their current configuration, either through a promoter in front of the 14th exon in the nurf-1–1 gene, or an alternative stop exon after the 2nd exon of the nurf-1–2 gene, since both of these elements existed in the ancestral state.

Second, duplication of the shared exons could facilitate future evolutionary change. Escape from adaptive conflict is a mechanism by which gene duplication can resolve the situation where a single gene is selected to perform multiple roles (Des Marais and Rausher, 2008). After duplication, each copy is free to improve its function independently. As organisms evolve, recruitment of NURF to specific loci could be accomplished by changing its binding with specific transcription factors through amino acid changes in NURF-1. The most rapidly evolving portion of the protein is within the 14th and 15th exons, suggesting positive selection acts on this region of the protein, potentially changing the transcription factors NURF-1 binds to. One issue that arises in species containing a single nurf-1 gene is the pleiotropy of genetic changes in the shared region; changing the amino acid sequence of the B isoform also changes the D isoform. Are there situations where modifying one isoform but not the other is preferred? In the clade of nematodes that have duplicated nurf-1, each gene is free to evolve independently. We present evidence that in these species, the duplicated region is free to evolve more rapidly. It should be interesting to characterize the exact function of this duplicated region and determine if these changes in protein sequence facilitate changes in transcription factor binding in an adaptive manner.

Conclusion

A fundamental problem in evolutionary biology is understanding the genetic mechanisms responsible for phenotypic diversity in extant species. Here, we present one route to address this problem. Experimental evolution and genetic analysis can be used to identify evolutionarily relevant genes and understand their function. This knowledge can be leveraged to understand patterns of evolution of these genes in other species. We believe that merging genetics, genomics, and molecular evolution is a powerful approach to understand the evolutionary mechanisms responsible for long-term adaptation and species-level differences.

Materials and methods

Key resources table.

Reagent type or resource Designation Source or reference Identifiers Additional information
Gene (C. elegans) nurf-1 WormBase Wormbase ID: WBGene00009180 Sequence: F26H11.2
Gene (human) BPTF National Center for Biotechnology Information Gene ID: 2186
Strain, strain background (E. coli) OP50 Caenorhabditis Genetics Center (CGC) RRID:WB-STRAIN:OP50
Strain (C. elegans) CX12311 PMID: 21849976 RRID:WB-STRAIN:CX12311 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM66 PMID: 21849976 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM88 PMID: 21849976 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM288 PMID: 30328811 RRID:WB-STRAIN:PTM288 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM229 PMID: 30328811 RRID:WB-STRAIN:PTM229 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM98 This paper RRID:WB-STRAIN:PTM98 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM113 This paper RRID:WB-STRAIN:PTM113 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM116 This paper RRID:WB-STRAIN:PTM116 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM117 This paper RRID:WB-STRAIN:PTM117 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM118 This paper RRID:WB-STRAIN:PTM118 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM167 This paper RRID:WB-STRAIN:PTM167 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM170 This paper RRID:WB-STRAIN:PTM170 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM189 This paper RRID:WB-STRAIN:PTM189 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM203 This paper RRID:WB-STRAIN:PTM203 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM211 This paper RRID:WB-STRAIN:PTM211 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM316 This paper RRID:WB-STRAIN:PTM316 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM317 This paper RRID:WB-STRAIN:PTM317 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM319 This paper RRID:WB-STRAIN:PTM319 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM322 This paper RRID:WB-STRAIN:PTM322 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM325 This paper RRID:WB-STRAIN:PTM325 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM332 This paper RRID:WB-STRAIN:PTM332 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM354 This paper RRID:WB-STRAIN:PTM354 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM371 This paper RRID:WB-STRAIN:PTM371 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM372 This paper RRID:WB-STRAIN:PTM372 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM373 This paper RRID:WB-STRAIN:PTM373 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM376 This paper RRID:WB-STRAIN:PTM376 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM416 This paper RRID:WB-STRAIN:PTM416 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM417 This paper RRID:WB-STRAIN:PTM417 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM420 This paper RRID:WB-STRAIN:PTM420 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM487 This paper RRID:WB-STRAIN:PTM487 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM489 This paper RRID:WB-STRAIN:PTM489 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM512 This paper RRID:WB-STRAIN:PTM512 Strain Background: N2, Request a strain: please email the corresponding author
Strain (C. elegans) PTM517 This paper RRID:WB-STRAIN:PTM517 Strain Background: N2, Request a strain: please email the corresponding author
Cell line (Human) Colo-205 American Type Culture Collection (Rockville, MD)
Cell line (Human) MCF-7 American Type Culture Collection (Rockville, MD)
Cell line (Human) MDA-MB-231 American Type Culture Collection (Rockville, MD)
Cell line (Human) HeLa American Type Culture Collection (Rockville, MD)
Cell line (Human) A549 G. Roncador, CNIO
Sequence-based reagents (Plasmid) Plasmid: pSM Cori Bargmann Lab (Rockefeller University)
Sequence-based reagents (Plasmid) Plasmid: pDD162 PrU6::dpy-10_sgRNA PMID: 27467070 CRISPR/Cas9 gene editing
Sequence-based reagents (Plasmid) Plasmid: pDD162 Peft-3::Cas9 PMID: 27467070 CRISPR/Cas9 gene editing
Sequence-based reagents (Plasmid) Plasmid: pCFJ90 Addgene http://www.wormbuilder.org/test-page/about-mossci/
Sequence-based reagents (Plasmid) Plasmid: pCFJ104 Addgene http://www.wormbuilder.org/test-page/about-mossci/
Sequence-based reagents (Plasmid) Plasmid: pCFJ151 Addgene http://www.wormbuilder.org/test-page/about-mossci/
Sequence-based reagents (Plasmid) Plasmid: pCFJ601 Addgene http://www.wormbuilder.org/test-page/about-mossci/
Antibody (mouse monoclonal) anti-HA Life Technologies Cat. No.: 326700 (1:500)
Antibody (mouse monoclonal) anti-DYKDDDDK Life Technologies Cat. No.: MA191878 (1:1000)
Antibody (mouse monoclonal) anti-FLAG Millipore Sigma Cat. No.: F3165 (1:1000)
Antibody Horseradish peroxidase-conjugated secondary antibodies Dako Glostrup (1:10000)
Peptide, recombinant protein BPTF Novus Biologicals Cat. No.: NB100-41418
Peptide, recombinant protein Vinculin Sigma Cat. No.: V9131
Sequence-based reagents (Oligonucleotide) dpy-10 (cn64) PMID: 25161212 CRISPR/Cas9 gene editing
Commercial assay, kit Taqman probe:dpy-10 (kah82/kah83) Thermal: Custom TaqMan SNP Genotyping Assays PTM09
Commercial assay, kit NEB Q5 site directed mutagenesis kit NEB Cat. No.: E0554
Commercial assay, kit Next Ultra II Directional RNA Library Prep Kit NEB Cat. No.: E7760S
Commercial assay, kit Zymo DNA isolation kit Zymo Cat. No.: D4071
Commercial assay, kit Zymo DNA cleanup kit Zymo Cat. No.: D4064
Commercial assay, kit ddPCR Supermix for Probes BIORAD Cat. No.: 1863010
Commercial assay, kit Droplet Generation Oils BIORAD Cat. No.: 1863005
Commercial assay, kit ddPCR Droplet Reader Oil BIORAD Cat. No.: 1863004
Commercial assay, kit VECTASHIELD antifade Mounting Medium with DAPI VECTOR Cat. No.: H-1200
Software, Algorithm edgeR PMID: 19910308 RRID:SCR_012802 Opensource: https://bioconductor.org/packages/release/bioc/html/edgeR.html
Software, Algorithm SARtools PMID: 27280887 RRID:SCR_016533 Opensource: https://github.com/PF2-pasteur-fr/SARTools
Software, Algorithm IGV PMID: 21221095 RRID:SCR_011793 https://software.broadinstitute.org/software/igv/
Software, Algorithm Kallisto PMID: 27043002 RRID:SCR_016582 https://pachterlab.github.io/kallisto/
Software, Algorithm HISAT2 PMID: 25751142 RRID:SCR_015530 https://ccb.jhu.edu/software/hisat2/index.shtml
Software, Algorithm Samtools PMID: 19505943 RRID:SCR_002105 http://samtools.sourceforge.net/
Software, Algorithm Jalview PMID: 19151095 RRID:SCR_006459 http://www.jalview.org/
Software, Algorithm MAFFT PMID: 23329690 RRID:SCR_011811 https://mafft.cbrc.jp/alignment/software/
Software, Algorithm IQ-Tree PMID: 25371430 RRID:SCR_017254 http://www.iqtree.org
Software, Algorithm ITOL PMID: 27095192 https://itol.embl.de/
Software, Algorithm PAL2NAL PMID: 16845082

Strains

The following strains were used in this study:

Near isogenic lines (NILs)

  • CX12311 (N2*): kyIR1(V, CB4856 > N2), qgIR1(X, CB4856 > N2)

  • PTM66 (NIL(nurf-1,LSJ2>N2*)): kyIR87(II, LSJ2 > N2); kyIR1(V, CB4856 > N2), qgIR1(X, CB4856 > N2)

CRISPR-generated allelic replacement lines (ARLs)

  • PTM88 (ARLdel, LSJ2>N2): kyIR1(V, CB4856 > N2); qgIR1(X, CB4856 > N2); nurf-1(kah3)II; spe-9(kah132)I

  • PTM416 (ARLintron,LSJ2>N2): nurf-1(kah127)II

  • PTM417: kyIR1(V, CB4856 > N2); qgIR1(X, CB4856 > N2); nurf-1(kah3)II

CRISPR-generated barcoded strains

  • PTM229: dpy-10(kah82)II

  • PTM288: kyIR1(V, CB4856 > N2); qgIR1(X, CB4856 > N2); dpy-10(kah82)II

CRISPR-generated epitope-tagged strain

  • PTM420 (HA-FLAG): nurf-1(kah124,kah133)II

CRISPR-generated STOP codon replacement lines

  • PTM98 (exon 23): nurf-1(kah11)II

  • PTM203 (exon 26): nurf-1(kah68)II

  • PTM316 (exon 1): nurf-1(kah90)II/oxTi924 II

  • PTM317 (exon 2): nurf-1(kah91)II/oxTi924 II

  • PTM319 (exon 15): nurf-1(kah93)II/oxTi924 II

  • PTM322 (exon 18): nurf-1(kah96)II/oxTi924 II

  • PTM325 (exon 19): nurf-1(kah99)II/oxTi924 II

  • PTM332 (exon 2): nurf-1(kah106) II/oxTi924 II

  • PTM487 (exon 7): nurf-1(kah142) II/oxTi721 II

CRISPR-generated domain replacement lines

  • PTM113 (PHD1): nurf-1(kah16)II

  • PTM116 (PHD2): nurf-1(kah19)II

  • PTM117 (PHD2): nurf-1(kah20)II

  • PTM118 (Bromodomain): nurf-1(kah21)II

  • PTM167 (Bromodomain): nurf-1(kah32)II

  • PTM170 (double PHD): nurf-1(kah19,kah36)II

  • PTM189 (three domains): nurf-1(kah19,kah36,kah54)II

  • PTM211 (double PHD): nurf-1(kah66,kah73)II

MosSCI transgenic strains

  • PTM371: nurf-1(kah93) II/oxTi721 II; kahSi7

  • PTM372: nurf-1(kah96) II/oxTi721 II; kahSi7

  • PTM373: nurf-1(kah99) II/oxTi721 II; kahSi7

  • PTM376: nurf-1(n4295) II; kahSi7

  • PTM517: kyIR1 (V, CB4856 > N2); qgIR1 (X, CB4856 > N2); nurf-1(kah3) II; kahSi7

CRISPR-generated deletion strains

  • PTM512 (23rd exon deletion): nurf-1(kah149) II

  • PTM489 (HA-FLAG + 23rd exon deletion): nurf-1(kah124,kah133,kah144)II

Other double mutants

  • PTM354: nurf-1(n4295, kah113) II/oxTi924 II

Strain construction

Previously described strains

CX12311, PTM66, and PTM88 were all previously described (McGrath et al., 2011; Large et al., 2016).

CRISPR-generated allelic replacement lines (ARLs)

We used the coCRISPR protocol to generate all CRISPR-edited lines using single-strand oligonucleotides to make precise edits (Arribere et al., 2014; Paix et al., 2015).

Resequencing of the PTM88 strain identified a number of background mutations, including an A to G missense SNV that is predicted to change an asparagine to an aspartic acid which we named kah132. The flanking sequence of this mutation is 5’-cgacaatgac[a]atcgccaggg-3’. We backcrossed out this spe-9(kah132) mutation, along with additional background mutations, to create PTM417.
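
As a consistency check on this annotation, the three possible reading frames around the bracketed base can be translated before and after the A-to-G change. The sketch below uses the standard genetic code and the flanking sequence given above. It assumes the sequence is written on the coding strand, which the text does not state explicitly, and the function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))

# Flanking sequence of kah132 as given in the text: cgacaatgac[a]atcgccaggg
LEFT, REF, RIGHT = "CGACAATGAC", "A", "ATCGCCAGGG"
ALT = "G"  # the derived base (A to G)


def frame_effects(left, ref, alt, right):
    """Translate the codon holding the variant in each forward frame.

    Returns a list of (codon position of the variant, reference codon,
    alternate codon, reference amino acid, alternate amino acid).
    Assumes the given sequence is the coding strand.
    """
    ref_seq = left + ref + right
    alt_seq = left + alt + right
    pos = len(left)
    effects = []
    for offset in range(3):  # variant at codon position 1, 2, or 3
        start = pos - offset
        ref_codon = ref_seq[start:start + 3]
        alt_codon = alt_seq[start:start + 3]
        effects.append((offset + 1, ref_codon, alt_codon,
                        CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]))
    return effects


if __name__ == "__main__":
    for codon_pos, ref_codon, alt_codon, ref_aa, alt_aa in frame_effects(
            LEFT, REF, ALT, RIGHT):
        print(f"codon position {codon_pos}: "
              f"{ref_codon}>{alt_codon} ({ref_aa}>{alt_aa})")
```

Under that coding-strand assumption, only the frame in which the bracketed base is the first codon position (AAT to GAT) gives an asparagine-to-aspartic-acid change, in agreement with the predicted effect of kah132.
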

To create PTM416, we designed a number of guide RNAs near the intron SNV. However, we were unable to identify editing events using these guide RNAs, presumably due to the high A/T content of the region.

We turned to a two-step strategy to create the edit, first creating a deletion of the 2nd intron along with flanking exon regions using guide RNAs with high predicted efficiency. We created constructs driving the following sgRNAs:

5’- TCGATAATTATCCGTTTGT(GGG) −3’,

5’- TTGCATCATATCCCACAAA(CGG) - 3’,

5’- ACGGTAGCTCATGAAGAGA(AGG) −3’ and 5’- TTCCGACGAATATAAGAAA(CGG) −3’

We also ordered an oligonucleotide repair:

5’-GTCTGTTAGAGATGCTATTAATGTCGATAATTATCgctaccataggcaccacgagcgagATTCGTCGGAATTTAAGAAACTTGTGAATAATGTT −3’

We injected 50 ng/μl of Peft-3::Cas9, 25 ng/μl of dpy-10 sgRNA, 500 nM dpy-10(cn64) repair oligo, 10 ng/μl of each of the nurf-1 sgRNAs listed above, and 500 nM of the repair oligonucleotide into CX12311 animals.

Jackpot broods were identified and roller animals were genotyped using the following primers along with the BanI restriction enzyme:

5’- GCAGGCCGGCCTTCGCGCCTGGGTAATACC −3’ and

5’- CGGCAGTTTTCGTCGTTCTG −3’

A single heterozygous worm was identified. Wild-type heterozygous progeny were identified (to remove the linked dpy-10 mutation) and this mutation was balanced (homozygous animals were sterile) with an integrated GFP marker near the nurf-1 gene (oxTi924). This strain was frozen with the following genotype: PTM366 nurf-1(kah125)/oxTi924 II; kyIR1 (V, CB4856 > N2); qgIR1 (X, CB4856 > N2) X.

For the second step, we crossed PTM366 to PTM66 animals and selected non-fluorescing animals to create nurf-1(kah125)/kyIR87(II, LSJ2 > N2); kyIR1 (V, CB4856 > N2); qgIR1 (X, CB4856 > N2) X compound heterozygote animals. We used the following sgRNAs to specifically target the nurf-1(kah125) homologous chromosome:

5’- ATctcgctcgtggtgccta(tgg) −3’ and 5’- TTCCGACGAATctcgctcg(tgg) −3’

The 2nd homologous chromosome, containing the kyIR87 introgression, was used as a repair construct. We injected 50 ng/μl Peft-3::Cas9, 10 ng/μl dpy-10 sgRNA, 500 nM dpy-10(cn64) repair oligo and 25 ng/μl of each nurf-1 sgRNA. Roller animals were then PCR genotyped to screen for animals that were homozygous for the LSJ2 allele at the intron and heterozygous for the 60 bp deletion.
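
The design logic of this second step, in which the guides recognize the replacement sequence introduced in the first step, can be checked with a few lines of Python. The sketch below is an illustration, not the procedure used in this study. It takes the first-step repair oligonucleotide and the two second-step sgRNAs exactly as listed above, and reports on which strand of the repair oligonucleotide each protospacer plus PAM is found and whether the PAM has the NGG form. The function names are hypothetical.

```python
# First-step repair oligonucleotide (lowercase = inserted replacement sequence).
REPAIR_OLIGO = ("GTCTGTTAGAGATGCTATTAATGTCGATAATTATC"
                "gctaccataggcaccacgagcgag"
                "ATTCGTCGGAATTTAAGAAACTTGTGAATAATGTT")

# Second-step sgRNAs as (protospacer, PAM), copied from the text.
GUIDES = [
    ("ATctcgctcgtggtgccta", "tgg"),
    ("TTCCGACGAATctcgctcg", "tgg"),
]

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def guide_hits(oligo, protospacer, pam):
    """Return (PAM is NGG?, list of strands on which protospacer+PAM occurs)."""
    target = (protospacer + pam).upper()
    pam_is_ngg = len(pam) == 3 and pam.upper().endswith("GG")
    strands = {
        "forward": oligo.upper(),
        "reverse": reverse_complement(oligo).upper(),
    }
    hits = [name for name, seq in strands.items() if target in seq]
    return pam_is_ngg, hits


if __name__ == "__main__":
    for protospacer, pam in GUIDES:
        pam_ok, hits = guide_hits(REPAIR_OLIGO, protospacer, pam)
        print(f"{protospacer}({pam}): NGG PAM={pam_ok}, found on {hits}")
```

Both guides match the reverse-complement strand of the repair oligonucleotide and span the junction between the inserted sequence and the adjacent flank, which is consistent with their being specific to the nurf-1(kah125) chromosome.
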

After screening, the target genotype was made homozygous. This strain was named PTM410 kyIR1 (V, CB4856 > N2); qgIR1 (X, CB4856 > N2); nurf-1(kah127)II. PTM416 was created by backcrossing the PTM410 strain to the N2 background using an RFP fluorescent nurf-1 balancer (oxTi721) strain for four generations. We genotyped the npr-1 and glb-5 sites to verify that PTM416 did not carry the introgressions surrounding these genes.

CRISPR-generated epitope-tagged lines

To create the PTM420 epitope-tagged strain, the following guide RNA and repair oligo were used to first add an HA epitope tag into the 16th exon:

5’-TGGCACTTGCTCAGTTGTGG-3’

5’-TTTTGTCAAATTTGGAGCCGTTTGGGGAACCTCTAggcgtagtcggggacgtcgtatgggtatcctcctcctcctcctcccTGcTGtTCgTCTGGgACcTGCTCgGTTGTaGTaGAAACTGCGAAACCAGTCGCGTCATCAGGCATGTC-3’

The following injection mix was used: 50 ng/μl Peft-3::Cas9, 10 ng/μl dpy-10 sgRNA, 500 nM dpy-10(cn64) repair oligo, 25 ng/μl of sgRNA, and 500 nM repair oligonucleotide.

We next added a 3xFLAG tag to the C-terminus of the nurf-1 gene using purified Cas9 protein (IDT, Catalog #1074181) and in vitro synthesized RNAs (Synthego) using a modified protocol (Prior et al., 2017). The injection mix was prepared as follows: 2 μM dpy-10 sgRNA (RNA scaffold 5’- GCUACCAUAGGCACCACGAG −3’ + tracrRNA) and 4 μM of two sgRNAs that targeted this region (RNA scaffold: 5’- CUCAUAAGUUCGCAUCCAG −3’+ tracrRNA, 5’- UUCGGAUCAGCUGUUGCCAC −3’+ tracrRNA) were mixed and incubated in a thermocycler at 95°C for five minutes, then 2.5 μg/μl Cas9 protein was added and the mix was incubated at room temperature for five minutes. Finally, 0.2 μM dpy-10 repair oligo and 0.5 μM FLAG repair oligo were added and the mix was incubated at room temperature for 60 min. This mix was injected into the HA-tagged strain to create the double epitope-tagged line.

CRISPR-generated STOP codon replacement lines, PHD/bromodomain replacement lines, and deletion lines

The following injection mix was used to create each of these strains: 50 ng/μl Peft-3::Cas9, 10 ng/μl dpy-10 sgRNA, 500 nM dpy-10(cn64) repair oligo, 25 ng/μl of sgRNA, and 500 nM repair oligonucleotide. For each strain/allele, the specific sgRNAs and repair oligos used to construct it are listed in Supplementary file 3. To facilitate the genotyping process, some of the repair oligos for STOP codon replacement sites contain restriction sites that alter some of the amino acids; the exact changes are listed in Supplementary file 4. In C. elegans nomenclature, identical edits must be given different allele names if they were isolated independently.

For mutants that were sterile (or led to sterility), we balanced these mutations using a GFP (oxTi924) or mCherry (oxTi721) integrated marker near nurf-1.

MosSCI transgenic strains

MosSCI strain construction followed the standard protocol (Frøkjær-Jensen, 2015). The injection mix was prepared as follows: 38 ng/μl pCFJ601 (Mos1 transposase), 30 ng/μl pCFJ151 - Pnurf-1.d::nurf-1.d-SL2-GFP (insertion vector with homologous arms), 2.5 ng/μl pCFJ90 (Pmyo-2::mCherry), and 5 ng/μl pCFJ104. This was injected into EG6699 uncoordinated animals. Three injected animals were placed on a single plate at 30°C to facilitate starvation. After 5 days, coordinated animals with GFP fluorescence and no red fluorescence were singled to new NGM plates and allowed to proliferate. Their progeny were singled, and a single homozygote without uncoordinated offspring was maintained. This homozygote was then backcrossed to N2 for four generations to remove unc-119(ed3) III, creating the PTM337 strain containing the integrated rescue construct. This strain was then crossed to a variety of nurf-1 alleles using standard protocols.

Cell culture

The following human cancer cell lines were used: Colo-205 (colorectal), MCF-7 and MDA-MB-231 (breast), and HeLa (cervix), obtained from the American Type Culture Collection (Rockville, MD); and A549 (lung), kindly provided by G. Roncador, CNIO. Cells were authenticated using STR profiling and tested negative for mycoplasma contamination. Cells were cultured in DMEM (Sigma-Aldrich) supplemented with 10% FBS (HyClone, Logan, UT, USA), except for A549 cells, which were cultured in RPMI (Sigma-Aldrich) supplemented with 10% FBS and sodium pyruvate (Thermo Scientific).

Molecular biology

All sgRNAs were constructed using the NEB Q5 site-directed mutagenesis kit (E0554) with primers

5’- [unique sgRNA protospacer sequence] + GTTTTAGAGCTAGAAATAGCAAGT −3’ and

5’- CAAGACATCTCGCAATAGG −3’ to modify a vector backbone containing a subclone of pDD163 with the U6 promoter to drive sgRNA expression in the germline.
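The primer design described above amounts to prepending each unique protospacer to a constant scaffold-priming sequence. The following is a minimal sketch of that step; the function name is hypothetical, and the example input (the guide sequence listed for the HA-tag edit) is assumed here to be a protospacer purely for illustration.

```python
# Constant 3' portion of the forward primer, as given in the text.
SCAFFOLD_PRIMING_SITE = "GTTTTAGAGCTAGAAATAGCAAGT"
# Fixed reverse primer, as given in the text.
REVERSE_PRIMER = "CAAGACATCTCGCAATAGG"


def sgrna_forward_primer(protospacer):
    """Return the forward primer: protospacer followed by the constant scaffold site."""
    protospacer = protospacer.upper()
    invalid = set(protospacer) - set("ACGT")
    if invalid:
        raise ValueError(f"Unexpected characters in protospacer: {sorted(invalid)}")
    return protospacer + SCAFFOLD_PRIMING_SITE


# Example input: guide sequence from the HA-tag edit (assumed to be the protospacer).
example_primer = sgrna_forward_primer("TGGCACTTGCTCAGTTGTGG")
print(example_primer, len(example_primer))
```

The same reverse primer is paired with every forward primer, so only the protospacer changes between constructs.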

To create the pCFJ151 - Pnurf-1.d::nurf-1.d-sl2-GFP plasmid, a nurf-1.d cDNA was first isolated from reverse-transcribed RNA using primers containing NheI restriction sites. This PCR product was then digested and ligated into a pSM vector. Second, a 2890 bp promoter region immediately upstream of the nurf-1.d isoform was amplified with a forward primer including an FseI site and a reverse primer including an AscI site. This PCR product was then digested and ligated into the vector constructed in the first step. Third, an SL2-GFP sequence from was cut and ligated into the new vector using KpnI and SpeI restriction sites. Finally, the entire sequence containing the promoter, cDNA, and sl2::GFP sequence was inserted into the pCFJ151 vector using the NEB Q5 site-directed mutagenesis kit.

Nematode growth conditions

The animals were cultured on 6 cm standard nematode growth medium (NGM) plates containing 2% agar seeded with 200 μl of an overnight culture of the E. coli strain OP50. Growth temperature was controlled using a 20°C incubator. Strains were grown for at least three generations without starvation before any experiments were conducted.

nurf-1 conserved regions

The predicted protein sequence for the NURF-1.A protein isoform was BLAST-searched against human or Drosophila melanogaster protein databases using NCBI blastp (McGinnis and Madden, 2004). Regions with alignment scores above 50 were annotated as homologous regions. These homologous regions were further verified through multiple sequence alignment with the Clustal Omega program (Chojnacki et al., 2017).
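The score threshold described above can be sketched as a filter over BLAST tabular output. This is an illustration under stated assumptions, not the authors' procedure: it assumes the hits were exported in the default 12-column tabular format and that "alignment score" corresponds to the bit score. The function name and the example hits are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def homologous_regions(tabular_text, min_score=50.0):
    """Return (query_start, query_end, subject_id, score) for hits scoring above min_score."""
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=COLUMNS, delimiter="\t")
    regions = []
    for row in reader:
        score = float(row["bitscore"])
        if score > min_score:
            regions.append((int(row["qstart"]), int(row["qend"]), row["sseqid"], score))
    return sorted(regions)


# Hypothetical hits, invented for illustration only.
example = (
    "NURF-1.A\thypothetical_subject_1\t45.0\t120\t60\t2\t100\t219\t80\t199\t1e-30\t110.0\n"
    "NURF-1.A\thypothetical_subject_2\t30.0\t40\t28\t0\t900\t939\t10\t49\t0.5\t32.5\n"
)
regions = homologous_regions(example)
print(regions)
```

Sorting by query start makes it easy to read the retained regions in order along the NURF-1.A sequence before checking them in a multiple sequence alignment.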

Competition experiment

Competition experiments were performed as described previously (Zhao et al., 2018).

RNA-seq analysis

RNA-seq samples for comparing the effect of the nurf-1 intron SNV

N2 and PTM416 worms were synchronized using a 3 hr hatch-off. Worms were observed every hour after 46 hr until the majority were in the L4 stage (which occurred at 48 hr). Four hours later, worms were collected and kept frozen in a −80°C freezer until RNA extraction for the 52 hr timepoint. Eight hours after that, young adult animals were collected and kept frozen in a −80°C freezer until RNA extraction for the 60 hr timepoint.

RNA-seq samples for comparing effect of the two derived nurf-1 mutations

CX12311, PTM66, PTM88, and LSJ2 L4 hermaphrodites were picked to fresh NGM agar plates. Their adult progeny were bleached using alkaline-bleach solution to isolate eggs for synchronization. The eggs were washed three times with M9 buffer and placed on a tube roller overnight. About 400 hatched L1 animals were placed on NGM agar plates and incubated at 20°C until they reached young adulthood, as determined by when eggs were observed on assay plates. These worms were then harvested, washed three times with M9 buffer, and frozen in a −80°C freezer for later processing.

RNA-seq samples for heat shock

N2 and PTM416 worms were synchronized using a 3 hr hatch-off. Eggs were cultured at 20°C until they reached L4 stage. Heat shock assay plates were then wrapped with parafilm and placed in a water bath pre-heated to 34°C for 2 hr or 4 hr. Worms were either collected right after heat shock or after 30 min at 20°C for the recovery group.

For each of the above experiments, RNA was isolated using Trizol. RNA libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (E7760S) following its standard protocol and sequenced on an Illumina NextSeq 500. Reads were aligned with HISAT2 using default parameters for paired-end sequencing (Kim et al., 2015). The aligned reads were then visualized in the IGV browser (Robinson et al., 2011) to examine the nurf-1 splice junction track (as shown in Figure 2—figure supplement 1). Transcript abundance was calculated using featureCounts and used as input for SARTools, which uses edgeR for normalization and gene-level differential analysis (Varet et al., 2016) and outputs a multidimensional scaling plot for each transcriptome analysis project. Genes with an adjusted p-value < 0.05 in a given comparison were considered differentially expressed. Upregulated and downregulated genes are plotted separately for the tissue and stage analysis. The expression of each gene was normalized by dividing by the sum of its expression levels across all stages, and this normalized table was used for hierarchical clustering analysis. Sequencing reads were uploaded to the SRA under PRJNA526473.
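Two of the post-processing steps described above, selecting genes by adjusted p-value and normalizing each gene by its summed expression across stages, can be sketched in a few lines. This is a simplified illustration rather than the SARTools/edgeR workflow itself; the function names, input layout, and example values are hypothetical.

```python
def split_by_direction(results, alpha=0.05):
    """Return (up, down) gene lists for rows with adjusted p-value below alpha.

    `results` is a list of (gene, log2_fold_change, adjusted_p) tuples;
    rows whose adjusted p-value is None (not tested) are skipped.
    """
    up, down = [], []
    for gene, log2fc, padj in results:
        if padj is None or padj >= alpha:
            continue
        if log2fc > 0:
            up.append(gene)
        elif log2fc < 0:
            down.append(gene)
    return up, down


def normalize_by_row_sum(expression):
    """Divide each gene's values by their sum across all stages.

    `expression` maps gene name -> list of expression values, one per stage.
    Genes whose values sum to zero are returned as all zeros.
    """
    normalized = {}
    for gene, values in expression.items():
        total = sum(values)
        normalized[gene] = [v / total if total else 0.0 for v in values]
    return normalized


# Invented values for illustration only.
up, down = split_by_direction([("gene_a", 1.5, 0.01), ("gene_b", -2.0, 0.04),
                               ("gene_c", 3.0, 0.2), ("gene_d", 1.0, None)])
table = normalize_by_row_sum({"gene_a": [1.0, 3.0], "gene_b": [0.0, 0.0]})
print(up, down, table)
```

After this normalization each gene's values sum to one, so a downstream clustering step groups genes by the shape of their expression profile across stages rather than by absolute abundance.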

Kallisto was used to quantify abundances of nurf-1 transcripts (Bray et al., 2016). We first created our own reference transcriptome by modifying the transcripts in the published WormBase reference transcriptome to restrict our analysis to the nurf-1.a, nurf-1.b, nurf-1.d, nurf-1.f and nurf-1.q isoforms. Alternative splice sites in the 10th, 16th, and 21st exons were also removed from this reference to ensure they were consistent between all isoforms. We used wild-type L2 RNA-seq data from Brunquell et al. (2016) to quantify wild-type nurf-1 abundance and extracted TPM (transcripts per million) values from the Kallisto output abundance table. We used RNA-seq data from PRJNA311958 and PRJNA321853 (Brunquell et al., 2016; Li et al., 2016) to quantify the heat shock response of nurf-1 isoforms in Figure 4—figure supplement 2B.
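Extracting per-isoform TPM values from a Kallisto abundance table can be sketched as follows. The tab-separated layout with `target_id` and `tpm` columns is the one Kallisto writes to `abundance.tsv`; the transcript identifiers and numbers below are hypothetical, since the real identifiers depend on the custom reference transcriptome.

```python
import csv
import io

# Hypothetical transcript identifiers; the real IDs depend on the reference used.
ISOFORMS = ["nurf-1.a", "nurf-1.b", "nurf-1.d", "nurf-1.f", "nurf-1.q"]


def isoform_tpm(abundance_tsv, isoforms=ISOFORMS):
    """Read a kallisto abundance table and return {isoform: tpm} for the requested isoforms."""
    reader = csv.DictReader(abundance_tsv, delimiter="\t")
    wanted = set(isoforms)
    return {row["target_id"]: float(row["tpm"])
            for row in reader if row["target_id"] in wanted}


# Made-up numbers in the layout of kallisto's abundance.tsv.
example = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "nurf-1.a\t7000\t6800\t10\t1.5\n"
    "nurf-1.b\t5000\t4800\t200\t40.0\n"
    "other-gene.1\t1000\t800\t50\t60.0\n"
)
tpm = isoform_tpm(example)
print(tpm)
```

For a real run, the same function can be given an open file handle, for example `with open("abundance.tsv", newline="") as fh: isoform_tpm(fh)`.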

Western blot

Four N2 and PTM420 gravid hermaphrodites were picked to fresh 5.5 cm NGM agar plates. Worms were collected just prior to starvation using M9 buffer and stored at −80°C until protein extraction. At least four plates of worms were used for each protein isolation. Worms were concentrated by centrifugation, and 2x sample buffer (100 mM Tris-HCl pH 6.8, 200 mM dithiothreitol, 4% SDS, 0.2% Bromophenol Blue, 20% glycerol) was added in a 1:1 w/v ratio. 1 μl of 500 mM EDTA and 1 μl of Halt protease inhibitor cocktail (100x) (Catalog number: 78430) were added for every 100 ng of worm sample. The protein sample was vortexed for 90 s and incubated on ice for about 1 min. Samples were then sonicated in a Bransonic 0.5 gallon ultrasonic bath filled with hot water (> 80°C) for 10 min and immediately placed on ice for 2 min. We then boiled the samples for 5 min and placed them on ice to cool down. The sample was centrifuged at 12,000 rpm for 5 min, and the supernatant was transferred to new tubes.

All samples were loaded on a 5% SDS-PAGE gel at 3 μl, 5 μl and 7 μl volumes, followed by Coomassie blue staining and washing steps. Gels were then dried using a DryEase Mini-Gel Drying System (Invitrogen, Catalog number: NI2387). These gels were used to normalize the protein loading volume for different samples.

Each sample was loaded onto a freshly made 6% or 10% SDS-PAGE gel and run at 25 mA. Gel samples were then transferred in 10 mM CAPS pH 10.5 buffer at 20 V and 20 mA for 17 hr to a PVDF membrane. Protein products with HA tag were detected using 1:500 anti-HA antibody (Life Technologies, Catalog number: 326700), NURF-1.D isoform with FLAG tag was detected using 1:1000 PIERCE ANTI-DYKDDDDK antibody (Life Technologies, Catalog number: MA191878) and NURF-1.F isoform with FLAG tag was detected using 1:1000 Millipore ANTI-FLAG antibody (Millipore Sigma, Catalog number: F3165).

For western blots of cancer cell lines, cells were lysed in 1% NP-40 buffer supplemented with protease and phosphatase inhibitors. Following sonication, clearing by centrifugation, and protein quantification, samples (100 μg) were subjected to electrophoresis in NuPAGE 3–8% Tris-acetate precast polyacrylamide gels (Thermo Scientific). Samples were run under reducing conditions and then transferred to nitrocellulose membranes, which were blocked with TBST, 5% skim milk. Membranes were incubated with primary antibodies detecting the following proteins: BPTF (NB100-41418, Novus Biologicals) (1:1,000) and Vinculin (V9131-2ML, Sigma-Aldrich) (1:10,000). This was followed by incubation with horseradish peroxidase-conjugated secondary antibodies (Dako, Glostrup, Denmark) (1:10,000). Reactions were detected using an ECL detection system and Bio-Rad ChemiDoc MP Imaging System (Hercules, CA, USA).

Egg-laying analysis

Egg-laying assays were performed as previously described (Large et al., 2016). All egg-laying assays were carried out at 20°C using standard 3 cm NGM plates seeded with the OP50 strain of Escherichia coli. OP50 was freshly prepared by streaking a glycerol stock of OP50 on an LB plate and letting it grow at 37°C overnight. A single colony was then picked to 5 ml fresh LB and cultured overnight in a shaking incubator at 200 rpm. 1 ml of the overnight culture was used to inoculate 200 ml of LB for 4–6 hr of growth at 37°C with shaking. The 200 ml OP50 culture was concentrated via centrifugation to an OD600 of 2.0, and this culture was used for seeding experimental plates with 50 μl aliquots. All experimental plates were prepared the week of the assay and left at 22.5°C for 18–24 hr following seeding. Plates were then placed at 4°C until the day of the assay and warmed to 20°C for 12 hr before each time point.

For strains with severely reduced fertility when homozygous, one L4 nematode was transferred to each experimental plate (seeded with 50 μl of OP50). The number of eggs laid was measured every 12 or 24 hr, and the number of eggs laid per hour was calculated by dividing the egg count by the time interval and by the number of animals left on each plate at each timepoint. At least 10 replicates were assayed for each strain.

For other strains, six fourth larval stage (L4) nematodes were transferred to each experimental plate (seeded with 50 μl of OP50). The number of eggs laid was measured every 12 or 24 hr, and the number of eggs laid per hour was calculated by dividing the egg count by the time interval and by the number of animals left on each plate at each timepoint. Six replicates were assayed for each strain.

Fecundity was calculated by summing up all eggs laid for each worm.
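The rate and fecundity calculations described above can be written out as a short sketch. The input layout and function names are hypothetical, and treating per-worm fecundity on multi-animal plates as the sum of per-interval eggs divided by the animals present in that interval is an assumption about how the totals were computed.

```python
def egg_laying_rates(counts):
    """Return eggs laid per animal per hour for each interval.

    `counts` is a list of (interval_hours, eggs_counted, animals_on_plate)
    tuples, one per timepoint.
    """
    return [eggs / (hours * animals) for hours, eggs, animals in counts]


def fecundity(counts):
    """Return total eggs per worm, summed over all intervals."""
    return sum(eggs / animals for _, eggs, animals in counts)


# Invented counts for one plate: two 12 hr intervals, then one 24 hr interval.
plate = [(12, 144, 6), (12, 120, 5), (24, 48, 4)]
rates = egg_laying_rates(plate)
total = fecundity(plate)
print(rates, total)
```

Dividing by the number of animals still on the plate at each timepoint keeps the rate comparable between intervals when animals are lost during the assay.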

Analysis of growth rate using body sizes

For strains with mutations in the PHD or bromodomains, growth analyses were performed as previously described (Large et al., 2016). For other strains, video recordings were analyzed similarly, with the exception that each animal was registered between video frames and used to calculate an average area for each individual worm. For strains that were balanced with fluorescent markers, only non-fluorescent worms were picked for video tracking.

Sperm and oocyte counting analysis

Four N2, PTM332, PTM319 and PTM332 gravid hermaphrodites were picked to fresh 5.5 cm NGM agar plates. After 3 days, 20–30 non-fluorescent L4 worms were picked to a new NGM plate and allowed to grow at 20°C for 12 hr. Worms were then picked into a drop of M9 buffer on a Fisher Superfrost Plus slide (22-037-246) and fixed by applying 95% ethanol three times. A drop of Vector Laboratories Vectashield Mounting Medium with DAPI (H-1500) was added, and a coverslip was applied and sealed with nail polish. Z-stack images were captured on a moving-stage Olympus IX73 microscope with a 40x objective. Oocytes were counted while imaging, and sperm number was measured manually by analyzing z-stack images in ImageJ with the CellCounter plugin.

Genomic and transcriptomic analysis of nurf-1 in additional Caenorhabditis species

To identify nurf-1 orthologs, we used homology information from www.wormbase.org or BLAST-searched C. elegans protein sequences against protein data provided by the Caenorhabditis genome project (http://blast.caenorhabditis.org). Genomic regions that contain the identified nurf-1 orthologs and related gff3 annotation data were downloaded from download.caenorhabditis.org or the WormBase public FTP site (data from Stein et al., 2003; Mortazavi et al., 2010; Fierst et al., 2015; Slos et al., 2017; Kanzaki et al., 2018; Yin et al., 2018; Lamelza et al., 2019). Species with public RNA-seq data were identified in the SRA database. These reads were downloaded and aligned to the corresponding nurf-1 DNA reference sequence for each species using HISAT2 and further manipulated using SAMTOOLS (Li et al., 2009; Kim et al., 2015). Gene annotations were manually corrected by inspecting the RNA-seq predicted intron sequences and used to generate Sashimi plots using the IGV browser (Robinson et al., 2011; Katz et al., 2015). The Sashimi Plot parameter Junction Coverage Min was adjusted for each species to best visualize the exon-exon junctions based upon coverage data. To identify the duplicated region for the NURF-1.B and NURF-1.D isoforms, we BLAST-searched each B isoform against a database of the D isoforms, and vice versa. The homologous regions for each protein were refined using a multiple sequence alignment of NURF-1.B and NURF-1.D proteins in Jalview (Waterhouse et al., 2009). For some species in which we were unable to resolve the full nurf-1 region (due to missing sequence for part of the region), we were still able to identify the duplicated region and included it in the phylogenetic analysis.

Phylogenetic analysis

We aligned the protein sequences of the duplicated region from the nurf-1 loci of 21 Caenorhabditis species using MAFFT (Katoh and Standley, 2013). We also aligned the protein sequences for regions outside the duplicated region. Maximum likelihood trees were estimated for each alignment along with 1000 ultrafast bootstraps (Hoang et al., 2018) using IQ-TREE (Nguyen et al., 2015), allowing the best-fitting substitution model to be automatically selected (Kalyaanamoorthy et al., 2017). We noted that the resulting topology recovered for the duplicated region was incongruent with the species tree, likely due to limited phylogenetic signal in the short alignment (Figure 6—figure supplement 6). To address this, we instead assessed the levels of support for alternative phylogenetic hypotheses surrounding the number and timing of duplication events that were congruent with the species tree. Log-likelihoods were calculated for each topology and an approximately unbiased (AU) test (Shimodaira, 2002) was performed using IQ-TREE. Newick trees were visualized using the iTOL web server (Letunic and Bork, 2016).

For three pairs of closely related sister taxa (C. briggsae/C. nigoni, C. latens/C. remanei, and C. afra/C. sulstoni), we aligned the protein sequences of both nurf-1-1 (nurf-1.b) and nurf-1-2 (nurf-1.d) using MAFFT and converted the resulting alignments to nucleotide alignments using PAL2NAL (Suyama et al., 2006). We calculated the dN/dS ratio (Ka/Ks) separately for the duplicated and non-duplicated portions of each alignment using the dnds Python module (available at: https://github.com/adelq/dnds).
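The conversion from a protein alignment to a codon-level nucleotide alignment can be illustrated with a minimal sketch. This is a simplified stand-in for the idea behind that step, not PAL2NAL itself: it assumes each coding sequence starts in frame, has no frameshifts or internal stop codons, and it does not verify that the codons actually translate to the aligned residues. The function name and toy sequences are hypothetical.

```python
def protein_to_codon_alignment(aligned_protein, cds):
    """Thread an unaligned coding sequence onto a gapped protein sequence.

    Each residue consumes the next three nucleotides of `cds`; each gap
    character '-' becomes a three-nucleotide gap '---'.
    """
    residues = aligned_protein.replace("-", "")
    if len(cds) < 3 * len(residues):
        raise ValueError("Coding sequence is too short for the protein")
    codons = []
    position = 0
    for symbol in aligned_protein:
        if symbol == "-":
            codons.append("---")
        else:
            codons.append(cds[position:position + 3])
            position += 3
    return "".join(codons)


# Toy pair of aligned proteins with their (invented) coding sequences.
codon_a = protein_to_codon_alignment("MK-L", "ATGAAACTG")
codon_b = protein_to_codon_alignment("MKQL", "ATGAAGCAGTTA")
print(codon_a)
print(codon_b)
```

Keeping gaps in multiples of three preserves the reading frame, which is what allows synonymous and non-synonymous differences to be counted codon by codon in the downstream dN/dS calculation.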

Statistics

Sample sizes were chosen to match the replicate numbers used in previously published assays. Each data point was considered a biological replicate. Animals for each replicate were grown independently for at least three generations. Significant differences between two means were determined using a two-tailed unpaired t-test. To correct for multiple comparisons, we used the Tukey multiple comparison test.

Proteomics

MCF-7 whole cell extracts were obtained by lysis in either NP-40 (see above) or Laemmli buffer, in both cases supplemented with protease inhibitors, and loaded in NuPAGE 3–8% Tris-acetate precast polyacrylamide gels (75 μg of protein per well). Gels were cut into two slices for western blotting and Coomassie staining. Gel bands running at the mobility of BPTF signals detected by western blot were digested with trypsin as previously described (Shevchenko et al., 2006). Briefly, gel bands were cut into 1 mm² cubes and de-stained with 50 mM ammonium bicarbonate (ABC) solution. Proteins were then reduced with 15 mM TCEP and alkylated with 30 mM CAA at 45°C for 45 min in the dark. Proteins were digested with 200 ng of Trypsin (Promega) overnight at 37°C in 50 mM ABC. The resulting peptides were desalted using homemade reversed-phase micro-columns containing C18 Empore disks (3M) at the bottom of the tip. Samples were dried down using a Speed-Vac and dissolved in 22 µL of loading buffer (0.2% formic acid) prior to LC-MS/MS analysis.

LC-MS/MS was performed by coupling an Ultimate 3000 RSLCnano System (Dionex) with a Q-Exactive Plus mass spectrometer (Thermo Scientific). Peptides were loaded into a trap column (Acclaim PepMap 100; 100 µm × 2 cm; Thermo Scientific) over 3 min at a flow rate of 10 µl/min in 0.1% formic acid (FA). Then peptides were transferred to an analytical column (PepMap rapid separation liquid chromatography C18; 2 µm, 75 µm × 50 cm; Thermo Scientific) and separated using a 90 min effective linear gradient (buffer A: 0.1% FA; buffer B: 100% acetonitrile, 0.1% FA) at a flow rate of 250 nl/min. The gradient used was as follows: 0–5 min 4% B, 5–7.5 min 6% B, 7.5–60 min 17.5% B, 60–72.5 min 21.5% B, 72.5–80 min 25% B, 80–94 min 42.5% B, 94–94.1 min 98% B, 94.1–99.9 min 98% B, 99.9–100 min 4% B and 100–104.5 min 4% B. The peptides were electrosprayed (2.1 keV) into the mass spectrometer through a heated capillary at 300°C and an S-Lens radio frequency (RF) level of 50%. The mass spectrometer was operated in a data-dependent mode, with an automatic switch between the MS and MS/MS scans using a top 15 method (minimum automatic gain control target, 3E3) and a dynamic exclusion time of 26 s. MS (350–1,400 m/z) and MS/MS spectra were acquired with a resolution of 70,000 and 17,500 full width at half maximum (FWHM; 200 m/z), respectively. Peptides were isolated using a 2 Thomson (Th) window and fragmented using higher-energy collisional dissociation at 27% normalized collision energy. The ion target values were 3E6 for MS (25 ms maximum injection time) and 1E5 for MS/MS (45 ms maximum injection time).

Raw files were processed with MaxQuant (v 1.6.2.6) using the standard settings against a human protein database (UniProtKB/Swiss-Prot, 20,373 sequences) including all annotated BPTF isoforms deposited in TrEMBL and supplemented with contaminants. Carbamidomethylation of cysteines was set as a fixed modification whereas oxidation of methionines and protein N-term acetylation were set as variable modifications. Minimal peptide length was set to seven amino acids and a maximum of two tryptic missed-cleavages were allowed. Results were filtered at 0.01 FDR (peptide and protein level).

Acknowledgements

We thank the Caenorhabditis Genetics Center, which is funded by NIH Office of Research Infrastructure Programs (P40 OD010440), for strains, and WormBase for information. We are grateful to Rachael Workman and Winston Timp for sharing Oxford Nanopore reads of nurf-1 prior to publication. We thank Matthew Rockman, Luke Noble, Janna Fierst, Erich Schwarz, and Janet Young for access to unpublished genomic data. We thank F.X. Real for support, valuable discussions, and comments on the manuscript and G. Roncador and the CNIO Monoclonal Antibody Core Unit for helpful contributions. We also thank Todd Streelman, Greg Gibson, Soojin Yi, David Katz, Annalise Paaby, and members of the McGrath lab for discussions, and Annalise Paaby and Erik Andersen for comments on the manuscript. This work was supported by NIH R01GM114170 (to PTM), R01GM121688 (to REE), and a CNIO friends/Juegaterapia grant (to IF). Work at CNIO was supported, in part, by grant RTI2018-101071-B-I00 from Ministerio de Ciencia, Innovación y Universidades. CNIO is supported by Ministerio de Ciencia, Innovación y Universidades as a Centro de Excelencia Severo Ochoa SEV-2015–0510.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Patrick T McGrath, Email: patrick.mcgrath@biology.gatech.edu.

Erich M Schwarz, Cornell University, United States.

Detlef Weigel, Max Planck Institute for Developmental Biology, Germany.

Funding Information

This paper was supported by the following grants:

  • National Institute of General Medical Sciences R01GM114170 to Patrick T McGrath.

  • National Institute of General Medical Sciences R01GM121688 to Ronald E Ellis.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Resources, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing.

Resources, Data curation, Formal analysis, Investigation, Visualization, Writing—review and editing.

Formal analysis, Investigation, Writing—review and editing.

Formal analysis, Investigation, Visualization, Methodology, Writing—review and editing.

Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology.

Data curation, Formal analysis, Validation, Investigation.

Conceptualization, Formal analysis, Funding acquisition, Investigation, Writing—review and editing.

Conceptualization, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing.

Additional files

Supplementary file 1. RNA-seq counts for each gene.
elife-48119-supp1.xlsx (1.4MB, xlsx)
DOI: 10.7554/eLife.48119.038
Supplementary file 2. GO Category analysis for intron SNV regulon.
elife-48119-supp2.xlsx (10.4KB, xlsx)
DOI: 10.7554/eLife.48119.039
Supplementary file 3. Guide RNAs for CRISPR-Cas9 genome edits.
elife-48119-supp3.xlsx (11.5KB, xlsx)
DOI: 10.7554/eLife.48119.040
Transparent reporting form
DOI: 10.7554/eLife.48119.041

Data availability

Sequencing reads were uploaded to the SRA under PRJNA526473.

The following dataset was generated:

Xu W, Long L, McGrath P. 2019. RNAseq of C. elegans under different genetic background and heat shock treatment to study the roles of different isoforms of nurf-1. NCBI Sequence Read Archive. PRJNA526473

The following previously published datasets were used:

Jian Li, Laetitia Chauve, Grace Phelps, Renée M Brielmann, Richard I Morimoto. 2016. RNA-seq analysis in C. elegans larval development and heat shock. NCBI Sequence Read Archive. PRJNA321853

Jessica Brunquell, Stephanie Morris, Yin Lu, Feng Cheng, Sandy D Westerheide. 2016. The genome-wide role of HSF-1 in the regulation of gene expression in Caenorhabditis elegans. NCBI Sequence Read Archive. PRJNA311958

References

  1. Alkhatib SG, Landry JW. The nucleosome remodeling factor. FEBS Letters. 2011;585:3197–3207. doi: 10.1016/j.febslet.2011.09.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Andersen EC, Lu X, Horvitz HR. C. elegans ISWI and NURF301 antagonize an Rb-like pathway in the determination of multiple cell fates. Development. 2006;133:2695–2704. doi: 10.1242/dev.02444. [DOI] [PubMed] [Google Scholar]
  3. Arribere JA, Bell RT, Fu BX, Artiles KL, Hartman PS, Fire AZ. Efficient marker-free recovery of custom genetic modifications with CRISPR/Cas9 in Caenorhabditis elegans. Genetics. 2014;198:837–846. doi: 10.1534/genetics.114.169730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bachmanov AA, Beauchamp GK. Taste receptor genes. Annual Review of Nutrition. 2007;27:389–414. doi: 10.1146/annurev.nutr.26.061505.111329. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Badenhorst P, Voas M, Rebay I, Wu C. Biological functions of the ISWI chromatin remodeling complex NURF. Genes & Development. 2002;16:3186–3198. doi: 10.1101/gad.1032202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Badenhorst P, Xiao H, Cherbas L, Kwon SY, Voas M, Rebay I, Cherbas P, Wu C. The Drosophila nucleosome remodeling factor NURF is required for Ecdysteroid signaling and metamorphosis. Genes & Development. 2005;19:2540–2545. doi: 10.1101/gad.1342605. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Balbás-Martínez C, Sagrera A, Carrillo-de-Santa-Pau E, Earl J, Márquez M, Vazquez M, Lapi E, Castro-Giner F, Beltran S, Bayés M, Carrato A, Cigudosa JC, Domínguez O, Gut M, Herranz J, Juanpere N, Kogevinas M, Langa X, López-Knowles E, Lorente JA, Lloreta J, Pisano DG, Richart L, Rico D, Salgado RN, Tardón A, Chanock S, Heath S, Valencia A, Losada A, Gut I, Malats N, Real FX. Recurrent inactivation of STAG2 in bladder Cancer is not associated with aneuploidy. Nature Genetics. 2013;45:1464–1469. doi: 10.1038/ng.2799. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Barak O, Lazzaro MA, Lane WS, Speicher DW, Picketts DJ, Shiekhattar R. Isolation of human NURF: a regulator of engrailed gene expression. The EMBO Journal. 2003;22:6089–6100. doi: 10.1093/emboj/cdg582. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, Slobodeniuc V, Kutter C, Watt S, Colak R, Kim T, Misquitta-Ali CM, Wilson MD, Kim PM, Odom DT, Frey BJ, Blencowe BJ. The evolutionary landscape of alternative splicing in vertebrate species. Science. 2012;338:1587–1593. doi: 10.1126/science.1230612. [DOI] [PubMed] [Google Scholar]
  10. Berry FB, Miura Y, Mihara K, Kaspar P, Sakata N, Hashimoto-Tamaoki T, Tamaoki T. Positive and negative regulation of myogenic differentiation of C2C12 cells by isoforms of the multiple homeodomain zinc finger transcription factor ATBF1. Journal of Biological Chemistry. 2001;276:25057–25065. doi: 10.1074/jbc.M010378200. [DOI] [PubMed] [Google Scholar]
  11. Bi Y, Ren X, Li R, Ding Q, Xie D, Zhao Z. Specific interactions between autosome and X chromosomes cause hybrid male sterility in Caenorhabditis species. Genetics. 2019;212:801–813. doi: 10.1534/genetics.119.302202. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Blumenthal T. Trans-splicing and operons in C. elegans. WormBook : The Online Review of C. Elegans Biology. 2012 doi: 10.1895/wormbook.1.5.2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bowser R, Giambrone A, Davies P. FAC1, a novel gene identified with the monoclonal antibody Alz50, is developmentally regulated in human brain. Developmental Neuroscience. 1995;17:20–37. doi: 10.1159/000111270. [DOI] [PubMed] [Google Scholar]
  14. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nature Biotechnology. 2016;34:525–527. doi: 10.1038/nbt.3519. [DOI] [PubMed] [Google Scholar]
  15. Brunquell J, Morris S, Lu Y, Cheng F, Westerheide SD. The genome-wide role of HSF-1 in the regulation of gene expression in Caenorhabditis elegans. BMC Genomics. 2016;17:559. doi: 10.1186/s12864-016-2837-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Buganim Y, Goldstein I, Lipson D, Milyavsky M, Polak-Charcon S, Mardoukh C, Solomon H, Kalo E, Madar S, Brosh R, Perelman M, Navon R, Goldfinger N, Barshack I, Yakhini Z, Rotter V. A novel translocation breakpoint within the BPTF gene is associated with a pre-malignant phenotype. PLOS ONE. 2010;5:e9657. doi: 10.1371/journal.pone.0009657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Cao J, Packer JS, Ramani V, Cusanovich DA, Huynh C, Daza R, Qiu X, Lee C, Furlan SN, Steemers FJ, Adey A, Waterston RH, Trapnell C, Shendure J. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science. 2017;357:661–667. doi: 10.1126/science.aam8940. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Chan YF, Marks ME, Jones FC, Villarreal G, Shapiro MD, Brady SD, Southwick AM, Absher DM, Grimwood J, Schmutz J, Myers RM, Petrov D, Jónsson B, Schluter D, Bell MA, Kingsley DM. Adaptive evolution of pelvic reduction in sticklebacks by recurrent deletion of a Pitx1 enhancer. Science. 2010;327:302–305. doi: 10.1126/science.1182213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Chelo IM, Afonso B, Carvalho S, Theologidis I, Goy C, Pino-Querido A, Proulx SR, Teotónio H. Partial selfing can reduce genetic loads while maintaining diversity during experimental evolution. G3: Genes|Genomes|Genetics. 2019;9:2811–2821. doi: 10.1534/g3.119.400239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Chen X, Shen Y, Ellis RE. Dependence of the sperm/oocyte decision on the nucleosome remodeling factor complex was acquired during recent Caenorhabditis briggsae evolution. Molecular Biology and Evolution. 2014;31:2573–2585. doi: 10.1093/molbev/msu198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Chojnacki S, Cowley A, Lee J, Foix A, Lopez R. Programmatic access to bioinformatics tools from EMBL-EBI update: 2017. Nucleic Acids Research. 2017;45:W550–W553. doi: 10.1093/nar/gkx273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Colosimo PF, Hosemann KE, Balabhadra S, Villarreal G, Dickson M, Grimwood J, Schmutz J, Myers RM, Schluter D, Kingsley DM. Widespread parallel evolution in sticklebacks by repeated fixation of ectodysplasin alleles. Science. 2005;307:1928–1933. doi: 10.1126/science.1107239. [DOI] [PubMed] [Google Scholar]
  23. Cutter AD. Sperm-limited fecundity in Nematodes: how many sperm are enough? Evolution. 2004;58:651–655. [PubMed] [Google Scholar]
  24. de Bono M, Bargmann CI. Natural variation in a neuropeptide Y receptor homolog modifies social behavior and food response in C. elegans. Cell. 1998;94:679–689. doi: 10.1016/S0092-8674(00)81609-8. [DOI] [PubMed] [Google Scholar]
  25. Des Marais DL, Rausher MD. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature. 2008;454:762–765. doi: 10.1038/nature07092. [DOI] [PubMed] [Google Scholar]
  26. Duveau F, Félix MA. Role of pleiotropy in the evolution of a cryptic developmental variation in Caenorhabditis elegans. PLOS Biology. 2012;10:e1001230. doi: 10.1371/journal.pbio.1001230. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Fierst JL, Willis JH, Thomas CG, Wang W, Reynolds RM, Ahearne TE, Cutter AD, Phillips PC. Reproductive mode and the evolution of genome size and structure in Caenorhabditis nematodes. PLOS Genetics. 2015;11:e1005323. doi: 10.1371/journal.pgen.1005323. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Frey WD, Chaudhry A, Slepicka PF, Ouellette AM, Kirberger SE, Pomerantz WCK, Hannon GJ, Dos Santos CO. BPTF maintains chromatin accessibility and the Self-Renewal capacity of mammary gland stem cells. Stem Cell Reports. 2017;9:23–31. doi: 10.1016/j.stemcr.2017.04.031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Frøkjær-Jensen C. Transposon-Assisted genetic engineering with Mos1-Mediated Single-Copy insertion (MosSCI). Methods in Molecular Biology. 2015;1327:49–58. doi: 10.1007/978-1-4939-2842-2_5. [DOI] [PubMed] [Google Scholar]
  30. Fyodorov DV, Kadonaga JT. Binding of Acf1 to DNA involves a WAC motif and is important for ACF-mediated chromatin assembly. Molecular and Cellular Biology. 2002;22:6344–6353. doi: 10.1128/MCB.22.18.6344-6353.2002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Goller T, Vauti F, Ramasamy S, Arnold HH. Transcriptional regulator BPTF/FAC1 is essential for trophoblast differentiation during early mouse development. Molecular and Cellular Biology. 2008;28:6819–6827. doi: 10.1128/MCB.01058-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Gray JC, Cutter AD. Mainstreaming Caenorhabditis elegans in experimental evolution. Proceedings of the Royal Society B: Biological Sciences. 2014;281:20133055. doi: 10.1098/rspb.2013.3055. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Greene JS, Brown M, Dobosiewicz M, Ishida IG, Macosko EZ, Zhang X, Butcher RA, Cline DJ, McGrath PT, Bargmann CI. Balancing selection shapes density-dependent foraging behaviour. Nature. 2016a;539:254–258. doi: 10.1038/nature19848. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Greene JS, Dobosiewicz M, Butcher RA, McGrath PT, Bargmann CI. Regulatory changes in two chemoreceptor genes contribute to a Caenorhabditis elegans QTL for foraging behavior. eLife. 2016b;5:e21454. doi: 10.7554/eLife.21454. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: improving the ultrafast bootstrap approximation. Molecular Biology and Evolution. 2018;35:518–522. doi: 10.1093/molbev/msx281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Hodgkin J, Barnes TM. More is not better: brood size and population growth in a self-fertilizing nematode. Proceedings of the Royal Society B: Biological Sciences. 1991;246:19–24. doi: 10.1098/rspb.1991.0119. [DOI] [PubMed] [Google Scholar]
  37. Hubbard EJ, Greenstein D. Introduction to the germ line. WormBook. 2005. doi: 10.1895/wormbook.1.18.1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Hughes AL. The evolution of functionally novel proteins after gene duplication. Proceedings of the Royal Society B: Biological Sciences. 1994;256:119–124. doi: 10.1098/rspb.1994.0058. [DOI] [PubMed] [Google Scholar]
  39. Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nature Reviews Genetics. 2010;11:97–108. doi: 10.1038/nrg2689. [DOI] [PubMed] [Google Scholar]
  40. Jungreis I, Lin MF, Spokony R, Chan CS, Negre N, Victorsen A, White KP, Kellis M. Evidence of abundant stop codon readthrough in Drosophila and other metazoa. Genome Research. 2011;21:2096–2113. doi: 10.1101/gr.119974.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods. 2017;14:587–589. doi: 10.1038/nmeth.4285. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Kanzaki N, Tsai IJ, Tanaka R, Hunt VL, Liu D, Tsuyama K, Maeda Y, Namai S, Kumagai R, Tracey A, Holroyd N, Doyle SR, Woodruff GC, Murase K, Kitazume H, Chai C, Akagi A, Panda O, Ke HM, Schroeder FC, Wang J, Berriman M, Sternberg PW, Sugimoto A, Kikuchi T. Biology and genome of a newly discovered sibling species of Caenorhabditis elegans. Nature Communications. 2018;9:3216. doi: 10.1038/s41467-018-05712-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution. 2013;30:772–780. doi: 10.1093/molbev/mst010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Katz Y, Wang ET, Silterra J, Schwartz S, Wong B, Thorvaldsdóttir H, Robinson JT, Mesirov JP, Airoldi EM, Burge CB. Quantitative visualization of alternative exon expression from RNA-seq data. Bioinformatics. 2015;31:2400–2402. doi: 10.1093/bioinformatics/btv034. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Keller A, Zhuang H, Chi Q, Vosshall LB, Matsunami H. Genetic variation in a human odorant receptor alters odour perception. Nature. 2007;449:468–472. doi: 10.1038/nature06162. [DOI] [PubMed] [Google Scholar]
  46. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. 2015;12:357–360. doi: 10.1038/nmeth.3317. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Kiontke KC, Félix MA, Ailion M, Rockman MV, Braendle C, Pénigault JB, Fitch DH. A phylogeny and molecular barcodes for Caenorhabditis, with numerous new species from rotting fruits. BMC Evolutionary Biology. 2011;11:339. doi: 10.1186/1471-2148-11-339. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Koludrovic D, Laurette P, Strub T, Keime C, Le Coz M, Coassolo S, Mengus G, Larue L, Davidson I. Chromatin-Remodelling complex NURF is essential for differentiation of adult melanocyte stem cells. PLOS Genetics. 2015;11:e1005555. doi: 10.1371/journal.pgen.1005555. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Kwon SY, Xiao H, Glover BP, Tjian R, Wu C, Badenhorst P. The nucleosome remodeling factor (NURF) regulates genes involved in Drosophila innate immunity. Developmental Biology. 2008;316:538–547. doi: 10.1016/j.ydbio.2008.01.033. [DOI] [PubMed] [Google Scholar]
  50. Kwon SY, Xiao H, Wu C, Badenhorst P. Alternative splicing of NURF301 generates distinct NURF chromatin remodeling complexes with altered modified histone binding specificities. PLOS Genetics. 2009;5:e1000574. doi: 10.1371/journal.pgen.1000574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Lamelza P, Young JM, Noble LM, Isakharov A, Palanisamy M, Rockman MV, Malik HS, Ailion M. Cryptic asexual reproduction in Caenorhabditis nematodes revealed by interspecies hybridization. bioRxiv. 2019. doi: 10.1101/588152. [DOI] [PMC free article] [PubMed]
  52. Landry J, Sharov AA, Piao Y, Sharova LV, Xiao H, Southon E, Matta J, Tessarollo L, Zhang YE, Ko MS, Kuehn MR, Yamaguchi TP, Wu C. Essential role of chromatin remodeling protein bptf in early mouse embryos and embryonic stem cells. PLOS Genetics. 2008;4:e1000241. doi: 10.1371/journal.pgen.1000241. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Landry JW, Banerjee S, Taylor B, Aplan PD, Singer A, Wu C. Chromatin remodeling complex NURF regulates thymocyte maturation. Genes & Development. 2011;25:275–286. doi: 10.1101/gad.2007311. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Large EE, Xu W, Zhao Y, Brady SC, Long L, Butcher RA, Andersen EC, McGrath PT. Selection on a subunit of the NURF chromatin remodeler modifies life history traits in a domesticated strain of Caenorhabditis elegans. PLOS Genetics. 2016;12:e1006219. doi: 10.1371/journal.pgen.1006219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Large EE, Padmanabhan R, Watkins KL, Campbell RF, Xu W, McGrath PT. Modeling of a negative feedback mechanism explains antagonistic pleiotropy in reproduction in domesticated Caenorhabditis elegans strains. PLOS Genetics. 2017;13:e1006769. doi: 10.1371/journal.pgen.1006769. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Research. 2016;44:W242–W245. doi: 10.1093/nar/gkw290. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Li H, Ilin S, Wang W, Duncan EM, Wysocka J, Allis CD, Patel DJ. Molecular basis for site-specific read-out of histone H3K4me3 by the BPTF PHD finger of NURF. Nature. 2006;442:91–95. doi: 10.1038/nature04802. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Li J, Chauve L, Phelps G, Brielmann RM, Morimoto RI. E2F coregulates an essential HSF developmental program that is distinct from the heat-shock response. Genes & Development. 2016;30:2062–2075. doi: 10.1101/gad.283317.116. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Lunde K, Egelandsdal B, Skuterud E, Mainland JD, Lea T, Hersleth M, Matsunami H. Genetic variation of an odorant receptor OR7D4 and sensory perception of cooked meat containing androstenone. PLOS ONE. 2012;7:e35259. doi: 10.1371/journal.pone.0035259. [DOI] [PMC free article] [PubMed] [Google Scholar]
  61. Mallarino R, Linden TA, Linnen CR, Hoekstra HE. The role of isoforms in the evolution of cryptic coloration in Peromyscus mice. Molecular Ecology. 2017;26:245–258. doi: 10.1111/mec.13663. [DOI] [PubMed] [Google Scholar]
  62. Mariani L, Lussi YC, Vandamme J, Riveiro A, Salcini AE. The H3K4me3/2 histone demethylase RBR-2 controls axon guidance by repressing the actin-remodeling gene wsp-1. Development. 2016;143:851–863. doi: 10.1242/dev.132985. [DOI] [PubMed] [Google Scholar]
  63. Martin A, Orgogozo V. The loci of repeated evolution: a catalog of genetic hotspots of phenotypic variation. Evolution. 2013;67:1235–1250. doi: 10.1111/evo.12081. [DOI] [PubMed] [Google Scholar]
  64. McBride CS, Baier F, Omondi AB, Spitzer SA, Lutomiah J, Sang R, Ignell R, Vosshall LB. Evolution of mosquito preference for humans linked to an odorant receptor. Nature. 2014;515:222–227. doi: 10.1038/nature13964. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. McGinnis S, Madden TL. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Research. 2004;32:W20–W25. doi: 10.1093/nar/gkh435. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. McGrath PT, Rockman MV, Zimmer M, Jang H, Macosko EZ, Kruglyak L, Bargmann CI. Quantitative mapping of a digenic behavioral trait implicates globin variation in C. elegans sensory behaviors. Neuron. 2009;61:692–699. doi: 10.1016/j.neuron.2009.02.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. McGrath PT, Xu Y, Ailion M, Garrison JL, Butcher RA, Bargmann CI. Parallel evolution of domesticated Caenorhabditis species targets pheromone receptor genes. Nature. 2011;477:321–325. doi: 10.1038/nature10378. [DOI] [PMC free article] [PubMed] [Google Scholar]
  68. McRae JF, Mainland JD, Jaeger SR, Adipietro KA, Matsunami H, Newcomb RD. Genetic variation in the odorant receptor OR2J3 is associated with the ability to detect the “Grassy” Smelling Odor, cis-3-hexen-1-ol. Chemical Senses. 2012;37:585–593. doi: 10.1093/chemse/bjs049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  69. Merkin J, Russell C, Chen P, Burge CB. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science. 2012;338:1593–1599. doi: 10.1126/science.1228186. [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Mortazavi A, Schwarz EM, Williams B, Schaeffer L, Antoshechkin I, Wold BJ, Sternberg PW. Scaffolding a Caenorhabditis nematode genome with RNA-seq. Genome Research. 2010;20:1740–1747. doi: 10.1101/gr.111021.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Müller B, Basler K. The repressor and activator forms of Cubitus interruptus control Hedgehog target genes through common generic gli-binding sites. Development. 2000;127:2999–3007. doi: 10.1242/dev.127.14.2999. [DOI] [PubMed] [Google Scholar]
  72. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating Maximum-Likelihood phylogenies. Molecular Biology and Evolution. 2015;32:268–274. doi: 10.1093/molbev/msu300. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Ohno S. Evolution by Gene Duplication. London, New York: Allen & Unwin; Springer-Verlag; 1970. [Google Scholar]
  74. Paix A, Folkmann A, Rasoloson D, Seydoux G. High efficiency, Homology-Directed genome editing in Caenorhabditis elegans Using CRISPR-Cas9 Ribonucleoprotein Complexes. Genetics. 2015;201:47–54. doi: 10.1534/genetics.115.179382. [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. Pal S, Gupta R, Kim H, Wickramasinghe P, Baubet V, Showe LC, Dahmane N, Davuluri RV. Alternative transcription exceeds alternative splicing in generating the transcriptome diversity of cerebellar development. Genome Research. 2011;21:1260–1272. doi: 10.1101/gr.120535.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  76. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nature Genetics. 2008;40:1413–1415. doi: 10.1038/ng.259. [DOI] [PubMed] [Google Scholar]
  77. Penley MJ, Greenberg AB, Khalid A, Namburar SR, Morran LT. No measurable fitness cost to experimentally evolved host defence in the Caenorhabditis elegans-Serratia marcescens host-parasite system. Journal of Evolutionary Biology. 2018;31:1976–1981. doi: 10.1111/jeb.13372. [DOI] [PubMed] [Google Scholar]
  78. Persson A, Gross E, Laurent P, Busch KE, Bretes H, de Bono M. Natural variation in a neural globin tunes oxygen sensing in wild Caenorhabditis elegans. Nature. 2009;458:1030–1033. doi: 10.1038/nature07820. [DOI] [PubMed] [Google Scholar]
  79. Prior H, Jawad AK, MacConnachie L, Beg AA. Highly efficient, rapid and Co-CRISPR-Independent genome editing in Caenorhabditis elegans. G3: Genes|Genomes|Genetics. 2017;7:3693–3698. doi: 10.1534/g3.117.300216. [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Richart L, Carrillo-de Santa Pau E, Río-Machín A, de Andrés MP, Cigudosa JC, Lobo VJS, Real FX. BPTF is required for c-MYC transcriptional activity and in vivo tumorigenesis. Nature Communications. 2016;7:10153. doi: 10.1038/ncomms10153. [DOI] [PMC free article] [PubMed] [Google Scholar]
  81. Roach NP, Sadowski N, Alessi AF, Timp W, Taylor J, Kim JK. The full-length transcriptome of C. elegans using direct RNA sequencing. bioRxiv. 2019. doi: 10.1101/598763. [DOI] [PMC free article] [PubMed]
  82. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nature Biotechnology. 2011;29:24–26. doi: 10.1038/nbt.1754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Ruthenburg AJ, Li H, Milne TA, Dewell S, McGinty RK, Yuen M, Ueberheide B, Dou Y, Muir TW, Patel DJ, Allis CD. Recognition of a mononucleosomal histone modification pattern by BPTF via multivalent interactions. Cell. 2011;145:692–706. doi: 10.1016/j.cell.2011.03.053. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Saxena AS, Salomon MP, Matsuba C, Yeh SD, Baer CF. Evolution of the mutational process under relaxed selection in Caenorhabditis elegans. Molecular Biology and Evolution. 2019;36:239–251. doi: 10.1093/molbev/msy213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Shabalina SA, Ogurtsov AY, Spiridonov NA, Koonin EV. Evolution at protein ends: major contribution of alternative transcription initiation and termination to the transcriptome and proteome diversity in mammals. Nucleic Acids Research. 2014;42:7132–7144. doi: 10.1093/nar/gku342. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Shevchenko A, Tomas H, Havlis J, Olsen JV, Mann M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nature Protocols. 2006;1:2856–2860. doi: 10.1038/nprot.2006.468. [DOI] [PubMed] [Google Scholar]
  87. Shimodaira H. An approximately unbiased test of phylogenetic tree selection. Systematic Biology. 2002;51:492–508. doi: 10.1080/10635150290069913. [DOI] [PubMed] [Google Scholar]
  88. Slos D, Sudhaus W, Stevens L, Bert W, Blaxter M. Caenorhabditis monodelphis sp. n.: defining the stem morphology and genomics of the genus Caenorhabditis. BMC Zoology. 2017;2:4. doi: 10.1186/s40850-017-0013-2. [DOI] [Google Scholar]
  89. Stein LD, Bao Z, Blasiar D, Blumenthal T, Brent MR, Chen N, Chinwalla A, Clarke L, Clee C, Coghlan A, Coulson A, D'Eustachio P, Fitch DH, Fulton LA, Fulton RE, Griffiths-Jones S, Harris TW, Hillier LW, Kamath R, Kuwabara PE, Mardis ER, Marra MA, Miner TL, Minx P, Mullikin JC, Plumb RW, Rogers J, Schein JE, Sohrmann M, Spieth J, Stajich JE, Wei C, Willey D, Wilson RK, Durbin R, Waterston RH. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLOS Biology. 2003;1:e45. doi: 10.1371/journal.pbio.0000045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Sterken MG, Snoek LB, Kammenga JE, Andersen EC. The laboratory domestication of Caenorhabditis elegans. Trends in Genetics. 2015;31:224–231. doi: 10.1016/j.tig.2015.02.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Stevens L, Félix MA, Beltran T, Braendle C, Caurcel C, Fausett S, Fitch D, Frézal L, Gosse C, Kaur T, Kiontke K, Newton MD, Noble LM, Richaud A, Rockman MV, Sudhaus W, Blaxter M. Comparative genomics of 10 new Caenorhabditis species. Evolution Letters. 2019;3:217–236. doi: 10.1002/evl3.110. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Sucena E, Delon I, Jones I, Payre F, Stern DL. Regulatory evolution of shavenbaby/ovo underlies multiple cases of morphological parallelism. Nature. 2003;424:935–938. doi: 10.1038/nature01768. [DOI] [PubMed] [Google Scholar]
  93. Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding Codon alignments. Nucleic Acids Research. 2006;34:W609–W612. doi: 10.1093/nar/gkl315. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Teotónio H, Estes S, Phillips PC, Baer CF. Experimental evolution with Caenorhabditis nematodes. Genetics. 2017;206:691–716. doi: 10.1534/genetics.115.186288. [DOI] [PMC free article] [PubMed] [Google Scholar]
  95. Ting JJ, Tsai CN, Schalkowski R, Cutter AD. Genetic contributions to ectopic sperm cell migration in Caenorhabditis nematodes. G3: Genes|Genomes|Genetics. 2018;8:3891–3902. doi: 10.1534/g3.118.200785. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Varet H, Brillet-Guéguen L, Coppée JY, Dillies MA. SARTools: a DESeq2- and EdgeR-Based R pipeline for comprehensive differential analysis of RNA-Seq data. PLOS ONE. 2016;11:e0157022. doi: 10.1371/journal.pone.0157022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  97. Wang C, Xin X, Xiang R, Ramos FJ, Liu M, Lee HJ, Chen H, Mao X, Kikani CK, Liu F, Dong LQ. Yin-Yang regulation of adiponectin signaling by APPL isoforms in muscle cells. Journal of Biological Chemistry. 2009;284:31608–31615. doi: 10.1074/jbc.M109.010355. [DOI] [PMC free article] [PubMed] [Google Scholar]
  98. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25:1189–1191. doi: 10.1093/bioinformatics/btp033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Wernick RI, Christy SF, Howe DK, Sullins JA, Ramirez JF, Sare M, Penley MJ, Morran LT, Denver DR, Estes S. Sex and mitonuclear adaptation in experimental Caenorhabditis elegans Populations. Genetics. 2019;211:1045–1058. doi: 10.1534/genetics.119.301935. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Wisotsky Z, Medina A, Freeman E, Dahanukar A. Evolutionary differences in food preference rely on Gr64e, a receptor for glycerol. Nature Neuroscience. 2011;14:1534–1541. doi: 10.1038/nn.2944. [DOI] [PubMed] [Google Scholar]
  101. Wood TE, Burke JM, Rieseberg LH. Parallel genotypic adaptation: when evolution repeats itself. Genetica. 2005;123:157–170. doi: 10.1007/s10709-003-2738-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  102. Wu B, Wang Y, Wang C, Wang GG, Wu J, Wan YY. BPTF is essential for T cell homeostasis and function. The Journal of Immunology. 2016;197:4325–4333. doi: 10.4049/jimmunol.1600642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Wysocka J, Swigut T, Xiao H, Milne TA, Kwon SY, Landry J, Kauer M, Tackett AJ, Chait BT, Badenhorst P, Wu C, Allis CD. A PHD finger of NURF couples histone H3 lysine 4 trimethylation with chromatin remodelling. Nature. 2006;442:86–90. doi: 10.1038/nature04815. [DOI] [PubMed] [Google Scholar]
  104. Xu B, Cai L, Butler JM, Chen D, Lu X, Allison DF, Lu R, Rafii S, Parker JS, Zheng D, Wang GG. The chromatin remodeler BPTF activates a stemness Gene-Expression program essential for the maintenance of adult hematopoietic stem cells. Stem Cell Reports. 2018;10:675–683. doi: 10.1016/j.stemcr.2018.01.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Yang L, Wang HN, Hou XH, Zou YP, Han TS, Niu XM, Zhang J, Zhao Z, Todesco M, Balasubramanian S, Guo YL. Parallel evolution of common allelic variants confers flowering diversity in Capsella rubella. The Plant Cell. 2018;30:1322–1336. doi: 10.1105/tpc.18.00124. [DOI] [PMC free article] [PubMed] [Google Scholar]
  106. Yin D, Schwarz EM, Thomas CG, Felde RL, Korf IF, Cutter AD, Schartner CM, Ralston EJ, Meyer BJ, Haag ES. Rapid genome shrinkage in a self-fertile nematode reveals sperm competition proteins. Science. 2018;359:55–61. doi: 10.1126/science.aao0827. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Zhang SJ, Wang C, Yan S, Fu A, Luan X, Li Y, Sunny Shen Q, Zhong X, Chen JY, Wang X, Chin-Ming Tan B, He A, Li CY. Isoform evolution in primates through independent combination of alternative RNA processing events. Molecular Biology and Evolution. 2017;34:2453–2468. doi: 10.1093/molbev/msx212. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Zhao Y, Long L, Xu W, Campbell RF, Large EE, Greene JS, McGrath PT. Changes to social feeding behaviors are not sufficient for fitness gains of the Caenorhabditis elegans N2 reference strain. eLife. 2018;7:e38675. doi: 10.7554/eLife.38675. [DOI] [PMC free article] [PubMed] [Google Scholar]
  109. Zhao Y, Wan J, Biliya S, Brady SC, Lee D, Andersen EC, Vannberg FO, Lu H, McGrath PT. A beneficial genomic rearrangement creates multiple versions of calcipressin in C. elegans. bioRxiv. 2019. doi: 10.1101/578088. [DOI]

Decision letter

Editor: Erich M Schwarz
Reviewed by: Erich M Schwarz, Eric Haag

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

[Editors’ note: this article was originally rejected after discussions between the reviewers, but the authors were invited to resubmit after an appeal against the decision.]

Thank you for submitting your work entitled "Evolution of Yin and Yang isoforms of a chromatin remodeling subunit results in the creation of two genes" for consideration by eLife.

We regret to inform you that we cannot, at present, accept your paper for publication in eLife. Because your paper's scientific work is of high quality, we are willing to reconsider a thoroughly revised version. However, we can only do so if you can satisfactorily address each of the three problems discussed below.

Your article has been reviewed by three peer reviewers, including Erich M Schwarz as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Eric Haag (Reviewer #3). Our decision has been reached after consultation between the reviewers.

The problems raised by the reviewers that necessitated this decision are as follows.

1) The domestication story is interesting, and the evidence that the first intron harbors an important variant vis-a-vis fitness in lab culture is quite strong. However, there is no mechanistic explanation for how it impacts nurf-1 function. This is a major deficiency for an eLife paper.

2) The authors prove that the full-length transcript of nurf-1 is not needed, and they provide evidence that one clade of Caenorhabditis has split the ancestral, single complex nurf-1 gene. However, there remains the question of whether nurf-1's 5'-ward and 3'-ward halves remain connected as part of one operon, with SL2 splicing to the 3'-ward half. Moreover, why the splitting of nurf-1 happened is a) not clear, and b) not connected to the domestication phenotype. It is thus a curiosity unrelated to a major biological event (thus far).

3) Although some evidence of accelerated evolution of the overlap region between Yin and Yang isoforms is provided, there is no compelling evidence that there was an "adaptive conflict" that needed to be resolved and was indeed resolved. The null hypothesis of "both work, and developmental systems drift happens" cannot be rejected. This issue is at the core of your article's claim to biological importance.

All three reviewers agreed that the paper has abundant experimental data of high quality. The problem is that these data are not convincingly linked to the explanations claimed, and the results seem to fall naturally into two distinct papers (one on domestication alleles, the other on the molecular biology and evolution of isoforms arising from the genetically complex nurf-1 locus). Unless the three points above can be addressed, this work may fare better as two shorter studies focused on these two topics.

We have attached the individual reviews by the three reviewers below.

Reviewer #1:

Using molecular biology, reverse genetics, and transcriptomics, Xu et al. make a compelling case for the evolution of two isoforms of nurf-1/BPTF in Caenorhabditis elegans that subdivide functions within this gene; they go on to show that a subgenus of Caenorhabditis species have evolved two entirely separate genes from these isoforms. Their observations provide a coherent explanation for many functions of the nurf-1 gene that had been intermittently seen over the years, but not reconciled. Their analysis of nurf-1 provides a beautiful instance of evolutionary adaptation, both in the sense of nurf-1's evolving into two functional isoforms, and in the sense that nurf-1 itself (as the authors show) mediates phenotypic tradeoffs in life history between early spermatocyte production and later oocyte production, which ultimately affects speed and output of the C. elegans reproductive life cycle that have different optima in different laboratory environments (plate culture for the N2 strain, versus liquid culture for the LSJ2 strain).

The experimental work described here is thorough, rigorous, and adroit.

The manuscript is very well composed. Background material is scholarly but appropriately concise. Generally, text is clean and direct. I was struck by the clarity with which the figures presented data visually. Papers such as this one that rely on genomic or transcriptomic analysis can be, in my experience, quite obtusely written or illustrated with figures that do not convey data well; Xu et al. have opposed these tendencies, for which I thank them.

I was impressed with the authors' efficient use of previously published RNA-seq data for heat shock expression, and of prepublication Nanopore long-RNA-seq data from Roach et al., 2019. By the time Xu et al. is published, I hope that the preprint Roach et al. can be cited as a published journal article.

In the text, I encountered only two points that I found difficult to decipher, a few missing data points, and one issue about nomenclature.

The first unclear point involved strain PTM417 versus strain PTM88. In Figure 1D, one of the strains shown is PTM417, but this is referred to as PTM88 in the figure legend. Reading the text carefully twice and reviewing the Materials and methods did not clarify for me which was the correct strain for Figure 1, although I suspect the correct strain is PTM417 rather than PTM88. In particular, the description of PTM417 in the Strains list (Materials and methods) is not consistent with the description of how PTM417 was constructed (subsection “CRISPR-generated allelic replacement lines (ARLs)”). I request that the authors entirely resolve this confusion.

The second unclear point involved the "Normalized Size" assay shown in Figure 3C. Two readings of the manuscript and one careful reading of the Materials and methods did not clarify, for me, what was meant by 'size' (though my guess was that it was body size). I went back to their key reference for many methods (Large et al., 2016) and concluded that the authors actually mean "Normalized body size". This should be made completely clear to the reader, both in the y-axis labeling of Figure 3C and in the legend text for Figure 3C, as well as in the Materials and methods.

One missing data point is in Table 2. This table should list a stop-codon location for kah11 in the isoform NURF-1.F, and only in that one isoform. Instead, Table 2 currently gives no stop-codon mutations for kah11 at all. In addition, Figure 3D (which is supposed to show all stop codons in their structural context) also omits kah11. These omissions from both Table 2 and Figure 3D should be corrected.

A second missing data point appears to be in the genotype table of Figure 4B. For the genotype +/kah96, this table claims that there are 0 functional copies of the isoform nurf-1.d; however, I do not see how this can be possible, given that the wild-type allele [+] should encode at least 1 functional copy of nurf-1.d. Unless there is something that I am badly misunderstanding (always possible), please correct this.

About nomenclature: the authors have dubbed the NURF-1.B isoform "Yin" and the NURF-1.D isoform "Yang". I assume that this is because of their order of expression (NURF-1.B is expressed before NURF-1.D), so that calling them "Yin and Yang" follows their times of activity. However, I found the nomenclature confusing because Yang is associated with stereotypically masculine traits and entities, whereas Yin is associated with feminine ones. Yet, the authors' current nomenclature has Yin assigned to the isoform promoting spermatogenesis (NURF-1.B), while Yang is assigned to the isoform promoting oogenesis (NURF-1.D). Since I assume that the authors would like to make it *easy* for people to properly remember which isoform does what, I would strongly encourage them to switch their nomenclature (so that NURF-1.B gets called Yang, and NURF-1.D gets called Yin).

Reviewer #2:

Here, Xu et al. report several splice isoforms of the C. elegans nurf-1 gene and show that they have different, possibly opposite, effects on certain aspects of gametogenesis. This has implications for fitness. The authors also found that the complex nurf-1 locus has undergone a partial duplication in several close relatives of C. elegans, thereby segregating distinct functions of nurf-1 into two separate genes.

1) The finding that gene expression profiles of N2 and ARL strains differ considerably at 52h, but apparently not at 60h (Figure 1F), seems quite interesting. Surprisingly, it was not explored further. At the least, it should be discussed in more detail.

2) In the second paragraph of the subsection “The B and D isoforms have opposite effects on cell fate during gametogenesis”, the authors state that the observed reduction in the number of sperm is due to earlier sperm-to-oocyte switch. This is plausible, but other causes are also possible and I would encourage the authors to provide direct experimental support for this claim.

3) Are the defects in sperm production of certain nurf-1 alleles (e.g. kah106) unique to hermaphrodites, or are they seen in males as well? Conversely, are oocyte production defects of kah93 an issue in mutants that only produce oocytes? Addressing these questions could help to support the notion that different isoforms contribute to different aspects of the tradeoff between sperm and oocyte production. Are the observed defects in the numbers of gametes due to erroneous timing of the switch or some other problem? Can anything be said about the functions of different isoforms in gonochoristic species in which sperm vs. egg conflict is not a concern?

4) The concluding sentence of the Results section claims that the rate of amino acid replacements has accelerated in the duplicated exons. I did not find a formal test supporting this assertion.

5) Are the authors aware of this paper – Hughes, 1994?

Reviewer #3:

In this paper, Xu et al. present a meticulous examination of the nurf-1 locus of C. elegans. The study presents several interesting findings:

1) New evidence (beyond published studies) is presented indicating that nurf-1 has been a target of selection during domestication of laboratory strains, including the N2 and LSJ2 strains. In particular, the authors make a strong case for a major role of an intronic single nucleotide variant (SNV) in mediating adaptation to the NGM plate culture in the N2 lineage.

2) The authors then shift to a detailed characterization of the various nurf-1 transcripts and their necessity for growth and sustained fertility. This is most impressively supported by a battery of engineered stop codons. The results strongly suggest that the nurf-1.b and nurf-1.d transcripts, which overlap slightly, are the key effectors of nurf-1 function, and that the full-length nurf-1.a transcript is likely dispensable.

3) While isoform-restricted stop codons that reduce or eliminate nurf-1.b and nurf-1.d function both reduce self-fertility, they do so via opposite effects. nurf-1.b appears to be necessary to support robust spermatogenesis (i.e. its loss leads to a partial Fog), while nurf-1.d has a role in promoting the switch from spermatogenesis to oogenesis (i.e. it is a partial Mog).

4) In the clade of Caenorhabditis species that includes C. brenneri, the partially overlapping transcripts have been completely separated via a lineage-specific duplication of the exons that were historically shared.

Overall, this paper is a genetics tour de force. I do have a few suggestions, that if addressed, would tighten up the story:

1) While there appears to be a surprisingly major effect of the intronic SNV on both global gene expression and fitness, nothing is said about how this change in a homopolymeric run alters nurf-1's own gene expression. One might expect this to impact nurf-1.b transcription or splicing, and this could, in turn, alter NURF-1.B levels. Can the authors provide any data about the functional impact of the SNV on nurf-1? For example, an isoform frequency chart like that for Figure 2C, but comparing PTM228 and the ARL(intron, LSJ2>N2), would be very informative as to the mechanism by which the intron SNV impacts phenotype.

2) We are told that "brood size of C. elegans hermaphrodites is an important trait for evolutionary fitness in laboratory conditions." Indeed it is, but the timing also matters, with early progeny much more valuable than late progeny. Looking at the reproductive schedules in Figure 1—figure supplement 1 (and also in the Large et al., 2016 paper), there is a real shift in timing. Can the authors model, or at least speculate on, the expected impact of this?

3) In the species that have fully separated nurf-1.1 and nurf-1.2 genes, there is very little space between the two. Have the authors looked to see if the 5' end of the transcript from the downstream gene is spliced to SL2? If so, that would indicate these have formed an operon.

eLife. 2019 Sep 9;8:e48119. doi: 10.7554/eLife.48119.050

Author response


[Editors’ note: the author responses to the first round of peer review follow.]

The problems raised by the reviewers that necessitated this decision are as follows.

1) The domestication story is interesting, and the evidence that the first intron harbors an important variant vis-a-vis fitness in lab culture is quite strong. However, there is no mechanistic explanation for how it impacts nurf-1 function. This is a major deficiency for an eLife paper.

To address the mechanism of the intron SNV, we include an analysis of nurf-1 isoform expression using RNA-seq data that were already included in the paper (Figure 2—figure supplement 3). This analysis demonstrates that the effect on nurf-1 expression is subtle, as we were unable to identify significant differences in expression at timepoints when phenotypic effects were observed (i.e. spermatogenesis and expression of other genes). While disappointing from a mechanistic perspective, this serves as an interesting demonstration that the effect of a genetic variant on fitness can be significant despite having a subtle effect on transcription.

2) The authors prove that the full-length transcript of nurf-1 is not needed, and they provide evidence that one clade of Caenorhabditis has split the ancestral, single complex nurf-1 gene. However, there remains the question of whether nurf-1's 5'-ward and 3'-ward halves remain connected as part of one operon, with SL2 splicing to the 3'-ward half. Moreover, why the splitting of nurf-1 happened is a) not clear, and b) not connected to the domestication phenotype. It is thus a curiosity unrelated to a major biological event (thus far).

We do not believe that the two genes are expressed as an operon, for the following reasons: 1. In C. elegans, the b and d isoforms are expressed from independent promoters. For these genes to be expressed as an operon, additional genetic changes would have had to become fixed. 2. Previously published work from the Ellis lab supports the conclusion that the SL1 leader sequence is spliced to nurf-1-2 (Chen et al. – “Dependence of the sperm/oocyte decision on the nucleosome remodeling factor complex was acquired during recent Caenorhabditis briggsae evolution”). 3. We searched RNA-seq data for SL2 sequence in clipped reads directly upstream of nurf-1-2 without success. In C. briggsae and C. brenneri, we were able to identify three reads (2 and 1, respectively) that matched SL1 sequence. 4. In C. tropicalis, nurf-1-1 and nurf-1-2 are separated by ~10kb (Figure 6—figure supplement 3), which would be quite unusual for genes expressed in an operon. 5. In some species, nurf-1-1 and nurf-1-2 display different levels of expression, consistent with independent promoters being responsible for their expression (e.g. C. brenneri in Figure 6—figure supplement 3).

Finally, we note that even if the two genes are expressed in a single operon, this would not take away from the main parts of the story as they are free to evolve independently in the shared exon region.
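The clipped-read search mentioned in point 3 can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the soft-clipped 5' segments of reads near the gene start have already been extracted and oriented in the transcript sense, and the function names, the `min_len` threshold, and the two leader strings are hypothetical placeholders (the real SL1 and SL2 sequences would need to be supplied from a trusted source).

```python
def leader_suffix_match(clipped, leader, min_len=8):
    """Length of the longest suffix of `leader` that `clipped` ends with.

    A read that starts inside a spliced leader carries only the 3' part of
    the leader, so we compare against leader suffixes. Returns 0 when no
    suffix of at least `min_len` bases matches.
    """
    max_len = min(len(clipped), len(leader))
    for k in range(max_len, min_len - 1, -1):
        if clipped.endswith(leader[-k:]):
            return k
    return 0


def count_leader_reads(clipped_segments, leaders, min_len=8):
    """Count clipped segments by best-matching leader (exact matches only)."""
    counts = {name: 0 for name in leaders}
    for seg in clipped_segments:
        seg = seg.upper()
        best_name, best_len = None, 0
        for name, seq in leaders.items():
            k = leader_suffix_match(seg, seq.upper(), min_len)
            if k > best_len:
                best_name, best_len = name, k
        if best_name is not None:
            counts[best_name] += 1
    return counts


# Placeholder leader sequences, NOT the real SL1/SL2 sequences.
PLACEHOLDER_LEADERS = {
    "leader_A": "GGTTCCAAGTTTGAC",
    "leader_B": "GGTTAACCCTCAAGA",
}

# Hypothetical clipped segments: one matches A, one matches B,
# one matches nothing, and one is too short to be counted.
example_segments = ["CAAGTTTGAC", "CCTCAAGA", "ACGTACGT", "TTGAC"]
example_counts = count_leader_reads(example_segments, PLACEHOLDER_LEADERS)
```

Exact suffix matching is the simplest choice here; a real search would likely tolerate sequencing errors and would need to handle reads on either strand.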

3) Although some evidence of accelerated evolution of the overlap region between Yin and Yang isoforms is provided, there is no compelling evidence that there was an "adaptive conflict" that needed to be resolved and was indeed resolved. The null hypothesis of "both work, and developmental systems drift happens" cannot be rejected. This issue is at the core of your article's claim to biological importance.

First, we want to clarify for the reviewers (in case it was not clear from the first submission) that we are making two separate arguments as to why the duplication might be beneficial: 1) loss of the non-functional (a and q) transcripts that are necessary for expressing both functional isoforms (b and d) using the shared exons, and 2) adaptive conflict in the shared exon region.

While we agree with the reviewers that we cannot reject developmental drift and must rely on circumstantial evidence to make the case, we also note that this is almost always the case for any analysis of extant species. Ideally, we would have biochemical information about activity changes in the shared exon regions; however, this is out of scope for this paper. Besides providing evidence that the duplicated regions experience accelerated evolution, we believe that including the evidence that nurf-1 is targeted by laboratory evolution strengthens the case that variation in nurf-1 is under selection in natural environments as well. Additional evidence that nurf-1 is under selection in the wild was previously provided by the Ellis lab’s paper (Chen et al. – “Dependence of the sperm/oocyte decision on the nucleosome remodeling factor complex was acquired during recent Caenorhabditis briggsae evolution”).

We have rewritten the Discussion to make all of this clearer.

All three reviewers agreed that the paper has abundant experimental data of high quality. The problem is that these data are not convincingly linked to the explanations claimed, and the results seem to fall naturally into two distinct papers (one on domestication alleles, the other on the molecular biology and evolution of isoforms arising from the genetically complex nurf-1 locus). Unless the three points above can be addressed, this work may fare better as two shorter studies focused on these two topics.

We have attached the individual reviews by the three reviewers below.

Reviewer #1:

Using molecular biology, reverse genetics, and transcriptomics, Xu et al. make a compelling case for the evolution of two isoforms of nurf-1/BPTF in Caenorhabditis elegans that subdivide functions within this gene; they go on to show that a subgenus of Caenorhabditis species have evolved two entirely separate genes from these isoforms. Their observations provide a coherent explanation for many functions of the nurf-1 gene that had been intermittently seen over the years, but not reconciled. Their analysis of nurf-1 provides a beautiful instance of evolutionary adaptation, both in the sense of nurf-1's evolving into two functional isoforms, and in the sense that nurf-1 itself (as the authors show) mediates phenotypic tradeoffs in life history between early spermatocyte production and later oocyte production, which ultimately affects speed and output of the C. elegans reproductive life cycle that have different optima in different laboratory environments (plate culture for the N2 strain, versus liquid culture for the LSJ2 strain).

The experimental work described here is thorough, rigorous, and adroit.

The manuscript is very well composed. Background material is scholarly but appropriately concise. Generally, text is clean and direct. I was struck by the clarity with which the figures presented data visually. Papers such as this one that rely on genomic or transcriptomic analysis can be, in my experience, quite obtusely written or illustrated with figures that do not convey data well; Xu et al. have opposed these tendencies, for which I thank them.

I was impressed with the authors' efficient use of previously published RNA-seq data for heat shock expression, and of prepublication Nanopore long-RNA-seq data from Roach et al., 2019. By the time Xu et al. is published, I hope that the preprint Roach et al. can be cited as a published journal article.

In the text, I encountered only two points that I found difficult to decipher, a few missing data points, and one issue about nomenclature.

The first unclear point involved strain PTM417 versus strain PTM88. In Figure 1D, one of the strains shown is PTM417, but this is referred to as PTM88 in the figure legend. Reading the text carefully twice and reviewing the Materials and methods did not clarify for me which was the correct strain for Figure 1, although I suspect the correct strain is PTM417 rather than PTM88. In particular, the description of PTM417 in the Strains list (Materials and methods) is not consistent with the description of how PTM417 was constructed (subsection “CRISPR-generated allelic replacement lines (ARLs)”). I request that the authors entirely resolve this confusion.

The reviewer correctly points out the errors in the figure legend. PTM417 is a derivative of PTM88, constructed by backcrossing out the spe-9 mutation while leaving the 60bp deletion.

The second unclear point involved the "Normalized Size" assay shown in Figure 3C. Two readings of the manuscript and one careful reading of the Materials and methods did not clarify, for me, what was meant by 'size' (though my guess was that it was body size). I went back to their key reference for many methods (Large et al., 2016) and concluded that the authors actually mean "Normalized body size". This should be made completely clear to the reader, both in the y-axis labeling of Figure 3C and in the legend text for Figure 3C, as well as in the Materials and methods.

The figure legend for Figure 3C was accidentally deleted from the submitted manuscript. Additionally, the Materials and methods section was poorly written and incorrect. The label and legend for Figure 3C and the Materials and methods section have been fixed.

One missing data point is in Table 2. This table should list a stop-codon location for kah11 in the isoform NURF-1.F, and only in that one isoform. Instead, Table 2 currently gives no stop-codon mutations for kah11 at all. In addition, Figure 3D (which is supposed to show all stop codons in their structural context) also omits kah11. These omissions from both Table 2 and Figure 3D should be corrected.

Corrected.

A second missing data point appears to be in the genotype table of Figure 4B. For the genotype +/kah96, this table claims that there are 0 functional copies of the isoform nurf-1.d; however, I do not see how this can be possible, given that the wild-type allele [+] should encode at least 1 functional copy of nurf-1.d. Unless there is something that I am badly misunderstanding (always possible), please correct this.

Corrected.

About nomenclature: the authors have dubbed the NURF-1.B isoform "Yin" and the NURF-1.D isoform "Yang". I assume that this is because of their order of expression (NURF-1.B is expressed before NURF-1.D), so that calling them "Yin and Yang" follows their times of activity. However, I found the nomenclature confusing because Yang is associated with stereotypically masculine traits and entities, whereas Yin is associated with feminine ones. Yet, the authors' current nomenclature has Yin assigned to the isoform promoting spermatogenesis (NURF-1.B), while Yang is assigned to the isoform promoting oogenesis (NURF-1.D). Since I assume that the authors would like to make it *easy* for people to properly remember which isoform does what, I would strongly encourage them to switch their nomenclature (so that NURF-1.B gets called Yang, and NURF-1.D gets called Yin).

We agree with the reviewer and have modified the text accordingly.

Reviewer #2:

Here, Xu et al. report several splice isoforms of the C. elegans nurf-1 gene and show that they have different, possibly opposite, effects on certain aspects of gametogenesis. This has implications for fitness. The authors also found that the complex nurf-1 locus has undergone a partial duplication in several close relatives of C. elegans, thereby segregating distinct functions of nurf-1 into two separate genes.

1) The finding that gene expression profiles of N2 and ARL strains differ considerably at 52h, but apparently not at 60h (Figure 1F), seems quite interesting. Surprisingly, it was not explored further. At least it should be commented on in a more elaborate way.

We have added an additional sentence in the Results (subsection “An N2-derived variant in the second intron of nurf-1 increases fitness and brood size in laboratory conditions”, last paragraph).

2) In the second paragraph of the subsection “The B and D isoforms have opposite effects on cell fate during gametogenesis”, the authors state that the observed reduction in the number of sperm is due to earlier sperm-to-oocyte switch. This is plausible, but other causes are also possible and I would encourage the authors to provide direct experimental support for this claim.

These experiments are rather laborious (DAPI staining and counting hundreds of sperm in ~50-60 animals). Since this point is not central to the paper, and in order to spend time addressing the main points brought up by reviewers above, we have not performed these experiments. We agree that other causes are possible and have added an additional sentence in the Results.

3) Are defects in sperm production of certain nurf-1 alleles (e.g. kah106) unique to hermaphrodites or are they seen in males as well? Conversely, are oocyte production defects of kah93 an issue in mutants that only produce oocytes? Addressing these questions could help to support the notion that different isoforms contribute to different aspects of the tradeoff between sperm and oocyte production. Are the observed defects in the numbers of gametes due to erroneous timing of the switch or some other problem? Can anything be said about the functions of different isoforms in gonochoristic species in which sperm vs. egg conflict is not a concern?

This is a very interesting question that we can address using a recently accepted paper in Genetics (https://www.genetics.org/content/early/2019/08/08/genetics.119.302462), which demonstrates a role for nurf-1 in determining sperm size. Using the n4295 allele (which removes much of the d isoform), we can demonstrate that the d isoform plays a role in spermatogenesis (i.e. sperm size) in both males and hermaphrodites. However, the n4295 allele has opposite effects on body length in hermaphrodites (shorter) vs. males (longer). This indicates that nurf-1 has sex-specific roles, consistent with our central hypothesis that nurf-1 is a life history regulator and the fact that life history tradeoffs are influenced by sex. We would predict that the traits that nurf-1 controls will be sex- and species-specific (although some will of course be shared). This is an area that we are interested in following up on.

4) The concluding sentence of the Results section claims that the rate of amino acid replacements has accelerated in the duplicated exons. I did not find a formal test supporting this assertion.

We have performed a test and have added this to the text.

5) Are the authors aware of this paper – Hughes, 1994?

We were not aware of this paper. We thank the reviewer for pointing this out and we have added this reference to the Introduction and Discussion.

Reviewer #3:

In this paper, Xu et al. present a meticulous examination of the nurf-1 locus of C. elegans. The study presents several interesting findings:

1) New evidence (beyond published studies) is presented indicating that nurf-1 has been a target of selection during domestication of laboratory strains, including the N2 and LSJ2 strains. In particular, the authors make a strong case for a major role of an intronic single nucleotide variant (SNV) in mediating adaptation to the NGM plate culture in the N2 lineage.

2) The authors then shift to a detailed characterization of the various nurf-1 transcripts and their necessity for growth and sustained fertility. This is most impressively supported by a battery of engineered stop codons. The results strongly suggest that the nurf-1.b and nurf-1.d transcripts, which overlap slightly, are the key effectors of nurf-1 function, and that the full-length nurf-1.a transcript is likely dispensable.

3) While isoform-restricted stop codons that reduce or eliminate nurf-1.b and nurf-1.d function both reduce self-fertility, they do so via opposite effects. nurf-1.b appears to be necessary to support robust spermatogenesis (i.e. its loss leads to a partial Fog), while nurf-1.d has a role in promoting the switch from spermatogenesis to oogenesis (i.e. it is a partial Mog).

4) In the clade of Caenorhabditis species that includes C. brenneri, the partially overlapping transcripts have been completely separated via a lineage-specific duplication of the exons that were historically shared.

Overall, this paper is a genetics tour de force. I do have a few suggestions, that if addressed, would tighten up the story:

1) While there appears to be a surprisingly major effect of the intronic SNV on both global gene expression and fitness, nothing is said about how this change in a homopolymeric run alters nurf-1's own gene expression. One might expect this to impact nurf-1.b transcription or splicing, and this could, in turn, alter NURF-1.B levels. Can the authors provide any data about the functional impact of the SNV on nurf-1? For example, an isoform frequency chart like that for Figure 2C, but comparing PTM228 and the ARL(intron, LSJ2>N2), would be very informative as to the mechanism by which the intron SNV impacts phenotype.

As discussed above, we have added this analysis to the paper.

2) We are told that "brood size of C. elegans hermaphrodites is an important trait for evolutionary fitness in laboratory conditions." Indeed it is, but the timing also matters, with early progeny much more valuable than late progeny. Looking at the reproductive schedules in Figure 1—figure supplement 1 (and also in Large et al., 2016), there is a real shift in timing. Can the authors model, or at least speculate on, the expected impact of this?

This was sloppy on our part. We agree that timing and brood size trade off against each other as life-history traits and that both matter for fitness. Previous modeling (Cutter, 2004) and experimental work (Hodgkin, 1991) have suggested that N2 brood size balances this tradeoff. We make this clearer in the text.
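The reviewer's point that early progeny are worth more than late progeny can be made concrete with the standard Euler-Lotka equation, which defines the intrinsic growth rate r through sum_x m_x * exp(-r * x) = 1, where m_x is the number of offspring produced at age x. The sketch below is only an illustration under stated assumptions: it is not the model used in the cited work, survival to each age is assumed to be 1, and the two reproductive schedules are made-up numbers rather than data from the paper.

```python
import math


def euler_lotka_residual(r, ages_h, offspring):
    """Left-hand side of the Euler-Lotka equation minus 1 (survival assumed 1)."""
    return sum(m * math.exp(-r * x) for x, m in zip(ages_h, offspring)) - 1.0


def intrinsic_growth_rate(ages_h, offspring, lo=1e-6, hi=1.0, tol=1e-10):
    """Solve the Euler-Lotka equation for r (per hour) by bisection.

    The residual decreases monotonically in r, so bisection is safe as long
    as the root lies between `lo` and `hi`.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if euler_lotka_residual(mid, ages_h, offspring) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical schedules: offspring laid in successive 24 h windows,
# indexed by the age of the parent in hours.
ages_h = [72, 96, 120, 144]
early_schedule = [150, 100, 40, 10]         # 300 progeny, mostly early
late_larger_schedule = [30, 100, 150, 60]   # 340 progeny, mostly late

r_early = intrinsic_growth_rate(ages_h, early_schedule)
r_late_larger = intrinsic_growth_rate(ages_h, late_larger_schedule)
print(f"r (early, 300 progeny): {r_early:.4f} per hour")
print(f"r (late, 340 progeny):  {r_late_larger:.4f} per hour")
```

With these illustrative numbers the earlier schedule gives the higher growth rate even though its total brood is smaller, which is the sense in which timing and brood size trade off.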

3) In the species that have fully separated nurf-1.1 and nurf-1.2 genes, there is very little space between the two. Have the authors looked to see if the 5' end of the transcript from the downstream gene is spliced to SL2? If so, that would indicate these have formed an operon.

As discussed above, we do not believe that these genes are expressed as an operon, although we are unable to conclusively say they are not.

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Citations

    1. Xu W, Long L, McGrath P. 2019. RNAseq of C. elegans under different genetic background and heat shock treatment to study the roles of different isoforms of nurf-1. NCBI Sequence Read Archive. PRJNA526473
    2. Jian Li, Laetitia Chauve, Grace Phelps, Renée M Brielmann, Richard I Morimoto. 2016. RNA-seq analysis in C. elegans larval development and heat shock. NCBI Sequence Read Archive. PRJNA321853
    3. Jessica Brunquell, Stephanie Morris, Yin Lu, Feng Cheng, Sandy D Westerheide. 2016. The genome-wide role of HSF-1 in the regulation of gene expression in Caenorhabditis elegans. NCBI Sequence Read Archive. PRJNA311958

    Supplementary Materials

    Figure 1—source data 1. Source data for Figure 1.
    DOI: 10.7554/eLife.48119.006
    Figure 1—figure supplement 1—source data 1. Source data for Figure 1—figure supplement 1.
    DOI: 10.7554/eLife.48119.004
    Figure 2—figure supplement 1—source data 1. Source data for Figure 2—figure supplement 1.
    DOI: 10.7554/eLife.48119.010
    Figure 2—figure supplement 2—source data 1. Source data for Figure 2—figure supplement 2.
    DOI: 10.7554/eLife.48119.012
    Figure 3—source data 1. Source data for Figure 3.
    DOI: 10.7554/eLife.48119.019
    Figure 4—source data 1. Source data for Figure 4.
    DOI: 10.7554/eLife.48119.024
    Figure 5—source data 1. Source data for Figure 5.
    DOI: 10.7554/eLife.48119.027
    Figure 6—source data 1. Source data for Figure 6.
    DOI: 10.7554/eLife.48119.035
    Supplementary file 1. RNA-seq counts for each gene.
    elife-48119-supp1.xlsx (1.4MB, xlsx)
    DOI: 10.7554/eLife.48119.038
    Supplementary file 2. GO Category analysis for intron SNV regulon.
    elife-48119-supp2.xlsx (10.4KB, xlsx)
    DOI: 10.7554/eLife.48119.039
    Supplementary file 3. Guide RNAs for CRISPR-Cas9 genome edits.
    elife-48119-supp3.xlsx (11.5KB, xlsx)
    DOI: 10.7554/eLife.48119.040
    Transparent reporting form
    DOI: 10.7554/eLife.48119.041

    Data Availability Statement

    Sequencing reads were uploaded to the SRA under PRJNA526473.



    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
