PLOS Genetics. 2020 Feb 24;16(2):e1008606. doi: 10.1371/journal.pgen.1008606

A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of Caenorhabditis elegans

Yuehui Zhao 1, Lijiang Long 1,2, Jason Wan 3, Shweta Biliya 1, Shannon C Brady 4, Daehan Lee 4, Akinade Ojemakinde 1, Erik C Andersen 4, Fredrik O Vannberg 1,5, Hang Lu 5,6, Patrick T McGrath 1,2,6,7,*
Editor: Harmit S Malik8
PMCID: PMC7058356  PMID: 32092052

Abstract

Over long evolutionary timescales, major changes to the copy number, function, and genomic organization of genes occur; however, our understanding of the individual mutational events responsible for these changes is lacking. In this report, we study the genetic basis of adaptation of two strains of C. elegans to laboratory food sources using competition experiments on a panel of 89 recombinant inbred lines (RILs). Unexpectedly, we identified a single RIL with higher relative fitness than either of the parental strains. This strain also displayed a novel behavioral phenotype, resulting in a higher propensity to explore bacterial lawns. Using bulk-segregant analysis and short-read resequencing of this RIL, we mapped the change in exploration behavior to a spontaneous, complex rearrangement of the rcan-1 gene that occurred during construction of the RIL panel. We resolved this rearrangement into five unique tandem inversion/duplications using Oxford Nanopore long-read sequencing. rcan-1 encodes an ortholog of the human RCAN1/DSCR1 calcipressin gene, which has been implicated as a causal gene for Down syndrome. The genomic rearrangement in rcan-1 creates two complete and two truncated versions of the rcan-1 coding region, with a variety of modified 5’ and 3’ non-coding regions. While most copy-number variations (CNVs) are thought to act by increasing expression of duplicated genes, these changes to rcan-1 ultimately result in the reduction of its whole-body expression due to changes in the upstream regions. By backcrossing this rearrangement into a common genetic background to create a near isogenic line (NIL), we demonstrate that both the competitive advantage and exploration behavioral changes are linked to this complex genetic variant. This NIL strain does not phenocopy a strain containing an rcan-1 loss-of-function allele, which suggests that the residual expression of rcan-1 is necessary for its fitness effects.
Our results demonstrate how colonization of new environments, such as those encountered in the laboratory, can create evolutionary pressure to modify gene function. This evolutionary mismatch can be resolved by an unexpectedly complex genetic change that simultaneously duplicates and diversifies a gene into two uniquely regulated genes. Our work shows how complex rearrangements can act to modify gene expression in ways besides increased gene dosage.

Author summary

Evolution acts on genetic variants that modify phenotypes that increase the likelihood of staying alive and passing on these genetic changes to subsequent generations (i.e. fitness). There is general interest in understanding the types of genetic variants that can increase fitness in specific environments. One route by which fitness can be increased is through changes in behavior, such as finding new food sources. Here, we identify a spontaneous genetic change that increases exploration behavior and fitness of animals in laboratory environments. Interestingly, this is not a simple genetic change that deletes or changes the sequence of a protein product, but rather a complex structural variant that simultaneously duplicates the rcan-1 gene and modifies its expression in a number of tissues. Our work demonstrates how a complex structural change can duplicate a gene, modify the DNA control regions that determine its cellular sites of action, and confer a fitness advantage that could lead to its spread in a population.

Introduction

Structural variation, resulting in the removal, duplication, insertion, or rearrangement of large (> 50 bp) genomic regions, makes up a significant component of natural genetic variation in many different species [1–6]. The large-scale rearrangement of DNA can truncate genes, modify transcriptional regulatory regions, and/or increase gene dosage and expression. Consequently, structural variation can have profound, detrimental effects on phenotype, including a variety of human diseases [7–11]. However, structural variants are also thought to be important for adaptive evolution in natural populations [2, 12–14] and domesticated plants and animals [15], including a number of examples that link structural changes to putative adaptive phenotypic variation [16–20]. From an evolutionary perspective, these larger genomic changes are interesting for a number of reasons. A gene duplication creates a new genetic substrate for evolution to act on and, over long evolutionary timescales, can result in the creation of a paralogous gene [21, 22]. Inversion events can both change the chromatin state that a gene is found in and suppress recombination events within the inverted region [23, 24]. Finally, structural changes can create incompatibilities between populations, contributing to speciation [14, 25].

For these reasons, it is desirable to understand how genomic rearrangements modify phenotype and spread through populations. However, determining the effect of naturally-occurring genomic rearrangements on phenotype and fitness is very difficult due to linkage of nearby mutations. Experimental evolution is a powerful approach to study adaptation in real time due to the lower rate of nucleotide diversity between the selected strains, aiding in the identification of causal mutations [26–38]. These studies typically utilize microorganisms with short generation times such as E. coli or S. cerevisiae, elucidating the molecular basis of adaptation and profiling genome dynamics in evolving populations under diverse laboratory settings. By identifying and studying causal genetic variants, important insights into beneficial mutations have been gained, such as their occurrence frequency, the complexity of their molecular basis, the role of contingency and genetic background in their effect, and their fitness effects in specific environments. A number of studies have demonstrated that genomic rearrangements can spread in these populations due to the actions of positive selection [28, 29, 39–44]. In some of these experiments, gene duplicates are thought to facilitate the metabolism or transport of a limiting nutrient due to increased protein product responsible for a rate-limiting step of metabolism.

While these experiments have led to fundamental advances in our understanding of evolution in real time, it is desirable to perform similar experiments in multicellular organisms, with specialized tissues and the ability to respond to their environment using a nervous system. However, long-term adaptation studies are still less advanced in multicellular animals. In our lab, we use the nematode C. elegans to study the connection between genotype and phenotype. Compared to other species, C. elegans has a high rate of spontaneous structural mutations, as inferred by their presence in mutation accumulation lines and laboratory strains [41, 45–47]. In general, most of these structural changes are thought to be deleterious; they are purged in populations with higher effective population sizes [46]. However, the spread of copy number variants is also observed in animals carrying deleterious mutations, suggesting that positive selection also acts on copy number variants in certain contexts [41]. Structural changes are also common in wild strains of C. elegans, consistent with structural variants being beneficial in certain natural environments [1, 48].

Here, we study two historical laboratory strains of C. elegans, called N2 and LSJ2. These two strains share the same hermaphrodite ancestor, which was isolated in 1951 from mushroom compost collected in Bristol, UK (Fig 1A). In 1958, descendants of this animal were split into two distinct lineages and cultured in different laboratory conditions. The N2 lineage grew on agar plates seeded with bacteria (standard conditions for a C. elegans genetics laboratory). After about two decades growing in this environment, this lineage was cryopreserved in Sydney Brenner’s lab and named N2. After Sydney Brenner introduced C. elegans to the genetics research community, N2 became the standard reference strain used across the world [49, 50]. The second lineage was cultured in liquid, axenic media composed of liver and soy peptone extract as a food source for about fifty years before it was cryopreserved and named LSJ2 [51]. In the time between their separation into two lineages and cryopreservation, approximately 300 mutations arose and fixed in either of the lineages [51]. Previous work has identified six causal mutations of these 300 that confer phenotypic change and competitive advantage in the conditions these mutations arose in [51–56].

Fig 1. Competitive fitness measurements of N2*/LSJ2 RILs identify an outlier RIL.

(A) Overview of the life history of two laboratory strains of C. elegans since their isolation from the wild in 1951 and subsequent split into two separate lineages around 1958. The standard reference N2 strain was cultured on agar plates seeded with E. coli bacteria until methods of cryopreservation were developed. LSJ2 was cultured in liquid, axenic media until 2009 when a sample of the population was cryopreserved. Resequencing of these strains identified ~300 genetic differences that fixed in one of the two lineages. (B) Schematic of two parental strains used in high-throughput analysis. N2* (or CX12311) is a near-isogenic line (NIL) containing ancestral alleles of two genes, glb-5 (chromosome V) and npr-1 (chromosome X) backcrossed from the CB4856 wild strain. Beneficial alleles in these two genes fixed in the N2 lineage; use of the N2* strain allows us to exclude the effects of these alleles from our studies. (C) Example data for three pairwise competition experiments used to quantify the fitness differences between two strains in laboratory conditions. Every odd generation, allele proportion is quantified using digital PCR and fluorescent hydrolysis probes (dots). These points are used to estimate the relative fitness of strain by fitting a haploid selection model to these points (line). In these conditions, outcrossing is expected to be very low or absent due to the lack of males in the initial population. (D) Relative fitness levels were measured for a panel of 89 RIL strains generated between N2* and LSJ2 by competing each RIL against N2* for seven generations. RILs were ordered by their average fitness value (3 replicates were performed for each). Parental strains were also assayed (N2* and LSJ2). RILhf (red) is highlighted for its unusually high fitness. (E) QTL mapping on the relative fitness differences between the RIL strains. 
A single significant QTL on the right arm of chromosome II, which overlaps the previously identified nurf-1 gene, was identified. Threshold line is significance level at p = 0.05 from a 1,000 permutation test.

In this study, we used recombinant inbred lines (RILs) created using the N2 and LSJ2 strains combined with quantitative trait loci mapping (QTL mapping) to identify, in an unbiased manner, any additional mutations between N2 and LSJ2 that confer competitive advantage in standard N2-like laboratory growth conditions. During these experiments, we identified a beneficial, spontaneous, and complex inversion/duplication mutation in the rcan-1 gene that occurred during the construction of the RILs. This complex genomic rearrangement results in the partial duplication and inversion of five different regions, simultaneously duplicating the rcan-1 coding region and modifying the upstream promoter regions. While the gene copy number of rcan-1 is duplicated, the changes in upstream regions result in an overall decrease in rcan-1 expression. Our work demonstrates how the initial mutational events that create gene duplicates can be complicated, result in unexpected changes in gene expression, and provide fitness increases that will result in their fixation in a population.

Results

An N2/LSJ2 recombinant inbred line (RILhf) with higher competitive fitness and exploration behavior than either parental strain

Previously, we developed an assay to estimate the competitive fitness difference between two strains. Briefly, two strains are directly competed against each other in standard laboratory growth conditions (i.e. a single agar plate seeded with the OP50 strain of E. coli bacteria). Initially, 10 L4 hermaphrodite larvae from each strain are transferred to the first plate where they are allowed to eat and reproduce until their grandchildren reach the L1 stage. At this point, ~1000 L1 larvae are transferred to a new plate to eat and reproduce until their progeny reach the L1 stage. Subsequently, each generation, ~1000 L1 larvae are transferred to a new plate for a total of five to seven generations depending on the experiment. In these conditions, outcrossing is minimized due to the low spontaneous rate of males (confirmed by observing the populations before their transfer to a new plate). The proportion of each strain is then estimated every other generation by isolating genomic DNA from the mixed population and using digital PCR with detection by fluorescent hydrolysis probes targeted to a specific allele pair that distinguishes the two strains. In general, either a naturally-occurring genetic difference between the two strains or a CRISPR-edited silent mutation in the dpy-10 gene is used (listed in Materials and Methods). Finally, relative fitness is estimated by fitting a haploid model to the measured allele frequencies. This assay is a more direct measure of competitive fitness in laboratory conditions than growth rate, fecundity, or other fitness-proximal traits that are often used in C. elegans.
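The final fitting step can be illustrated with a minimal sketch. It assumes the standard haploid selection recursion, in which the log-odds of the allele frequency change linearly with generation, logit(p_t) = logit(p_0) + t·ln(w), so that relative fitness w can be recovered by ordinary least squares. The function names and the choice of a log-odds regression are assumptions made for illustration; the text above states only that a haploid model is fitted to the measured allele frequencies.

```python
import math

def predicted_frequency(p0, w, t):
    """Frequency of strain 1 after t generations under haploid selection.

    p0 is the starting frequency and w the relative fitness of strain 1.
    """
    num = p0 * w ** t
    return num / (num + (1.0 - p0))

def fit_relative_fitness(generations, freqs):
    """Estimate (w, p0) by least squares on logit(p_t) = logit(p0) + t*ln(w).

    Assumes every measured frequency lies strictly between 0 and 1.
    """
    logits = [math.log(p / (1.0 - p)) for p in freqs]
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(logits) / n
    sxx = sum((t - mean_t) ** 2 for t in generations)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(generations, logits))
    slope = sxy / sxx                      # estimate of ln(w)
    intercept = mean_y - slope * mean_t    # estimate of logit(p0)
    w = math.exp(slope)
    p0 = 1.0 / (1.0 + math.exp(-intercept))
    return w, p0
```

In this parameterization a value of w above 1 means that strain 1 increases in frequency each generation. Frequencies measured at exactly 0 or 1 would need to be excluded or handled with a different fitting method.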

To determine if additional LSJ2/N2 fixed mutations can affect fitness in N2-like laboratory conditions, we used a previously described panel of 89 recombinant inbred lines (RILs) between the CX12311 and LSJ2 strains [51]. CX12311 is a near isogenic line that carries ancestral npr-1 and glb-5 alleles from the CB4856 Hawaiian wild strain introgressed into an N2 background (Fig 1B; henceforth referred to as N2*). Using N2* as a parental strain eliminates the fitness effect of the derived alleles of N2 npr-1 and glb-5 [56]. Using the competition assay described above (Fig 1C), we measured the competitive advantage of each of the RIL strains against the N2* strain. A bimodal distribution of relative fitness values was observed in the RIL strains, suggesting that a single genetic locus accounted for the majority of the variation in the RIL strains (Fig 1D and S1 Table). Using these measured evolutionary fitness values for QTL mapping, we identified a single significant QTL on the right arm of Chromosome II centered over the nurf-1 gene (Fig 1E). We have previously shown that nurf-1 contains two fixed mutations from both the N2 and LSJ2 lineages that each affect the animals’ fitness in N2-like laboratory conditions [54, 57]. These results are consistent with our QTL analysis, suggesting that nurf-1 plays an important role in adapting to laboratory conditions.
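The logic of a genome scan with a permutation-based significance threshold (as used for Fig 1E) can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline used in this study: the scan statistic here is simply the absolute difference in mean phenotype between the two genotype classes at each marker, whereas dedicated QTL software typically reports LOD scores. All function names and the 0/1 genotype coding are hypothetical.

```python
import random
import statistics

def marker_score(genotypes, phenotypes):
    """Absolute difference in mean phenotype between genotype classes 0 and 1."""
    a = [y for g, y in zip(genotypes, phenotypes) if g == 0]
    b = [y for g, y in zip(genotypes, phenotypes) if g == 1]
    if not a or not b:
        return 0.0
    return abs(statistics.mean(a) - statistics.mean(b))

def genome_scan(marker_matrix, phenotypes):
    """Score every marker; marker_matrix[m][i] is the genotype of line i at marker m."""
    return [marker_score(row, phenotypes) for row in marker_matrix]

def permutation_threshold(marker_matrix, phenotypes,
                          n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold from the null distribution of the maximum score.

    Phenotypes are shuffled relative to genotypes n_perm times; the
    (1 - alpha) quantile of the per-permutation maximum is returned.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(genome_scan(marker_matrix, shuffled)))
    maxima.sort()
    idx = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return maxima[idx]
```

Taking the maximum across all markers in each permutation is what makes the threshold genome-wide: a marker is called significant only if its observed score exceeds what the best marker achieves by chance in most shuffled data sets.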

Interestingly, we found that one of the 89 RILs, CX12348 (henceforth called RILhf, hf for high fitness), had significantly higher fitness than either of the LSJ2 or N2* parental strains, which we validated in an independent competition experiment (Fig 1D and Fig 2A). RILhf contained a mixture of DNA from both the N2* and LSJ2 parental strains, with N2* DNA on the left arm of chromosome I, the entire chromosomes II, III, and V, and portions of the X chromosome (Fig 2B). The higher fitness of the RILhf strain could have two possible explanations: 1) a higher-order epistatic interaction between three or more of the 300 derived alleles or 2) a de novo beneficial mutation that occurred during construction of the RIL panel. We decided to focus on this unusual strain to determine the genetic basis of its higher fitness.

Fig 2. An outlier RIL with higher pairwise fitness and exploration behavior in laboratory conditions.

(A) The fitness advantage of RILhf was verified in an independent experiment (*: p < 0.05; **: p < 0.01; one-way ANOVA tests followed by Tukey’s honest significant difference test). (B) Left shows schematic of source DNA of RILhf (CX12348) from each parental strain. RILhf contains LSJ2 sequence on chromosome IV and parts of chromosome I and chromosome X. Compared to parental controls, RILhf animals are more likely to be found in the center or outside of the bacterial lawn than at its borders (LSJ2 not shown). (C) Exploration behavior differences were quantified by placing a single animal on a plate seeded with a circular lawn. After 16 hours, the amount of the plate that was explored was quantified by counting the number of grid squares containing animal tracks. Each point represents data from a single animal. The RILhf explored more of the plate than either parental strain (NS: not significant; ***: p < 0.001; one-way ANOVA tests followed by Tukey’s honest significant difference test).

Wild strains of C. elegans, as well as the N2* and LSJ2 parental strains, feed in groups on the borders of bacterial lawns, a strategy known as social behavior [52]. While growing the RILhf strain on standard plates, we noticed that animals had a stronger propensity to explore the centers and regions outside of the bacterial lawns, causing an increased number of worms and tracks in the center and outside of the lawn (Fig 2B). It has been previously shown that increases in exploration behavior are caused by changes in the relative time C. elegans spends in roaming and dwelling states in the presence of O2 and other chemical gradients created by the bacteria [58, 59]. To quantify this behavioral difference, we modified a previously described exploration assay [60] to measure long-term (16 hr) exploration behavior in the presence of circular lawns (instead of the uniform lawns used in standard assays). The RILhf strain explored a substantially larger fraction of the bacterial lawn than either of the parental strains (Fig 2C).

Mapping the causal mutation responsible for higher exploration behavior in RILhf

The change in exploration behavior is potentially an adaptive strategy for RILhf animals to increase their evolutionary fitness in the laboratory or it might be a pleiotropic effect of the underlying genetic basis of this fitness gain. Since this exploration trait is easier to assay than relative fitness, we first focused on mapping this phenotype using a bulk-segregant approach. We created two new small panels of 48 RILs between RILhf and either the N2* or LSJ2 parental strains and measured their exploration behavior (Fig 3A and 3B). An approximately equal number of RIL strains showed each parental phenotype, suggesting that this trait was controlled by a single locus. We grouped these strains into low or high exploration groups for each RIL panel and performed pooled genomic sequencing (~60x coverage) on the four groups that were created (Fig 3C). For each group, we estimated the allele frequency of each of the ~300 N2/LSJ2 genetic variants across the genome. In bulk-segregant analysis, genetic loci that are not responsible for the exploration behavioral difference are expected to have approximately equal N2/LSJ2 allele proportions in the low and high groups, whereas loci that contribute to the difference are expected to show a large difference in N2/LSJ2 allele proportion between the groups. In the sequencing results, a large allelic imbalance between the pooled sequencing groups was observed in the center of chromosome III in the RILhf x LSJ2 panels (Fig 3D). This result is expected if a de novo mutation arose and fixed in the RILhf strain in the center of chromosome III (which contains the N2 haplotype for the entire chromosome). In this scenario, a similar imbalance would occur for the de novo, causal variant in the RILhf x N2* cross; however, because the two strains are largely identical on chromosome III, we could not observe it using the LSJ2/N2 SNVs. In addition to the center region of chromosome III, we also detected a large allele frequency difference in the center of chromosome V, suggesting that genetic variation on chromosome V also contributes to exploration behavior. However, the allelic imbalance on V was opposite to our expectation (i.e. the higher exploration group contained LSJ2 alleles on V while the RILhf strain contains N2 alleles on V). Since chromosome III shows a stronger allelic imbalance signal than V (max = ~0.8 vs ~0.6) and goes in the expected direction, we focused on identifying the causal genetic variation in this region. However, it is possible that variation on chromosome V also contributes to exploration, although potentially in a manner unrelated to the RILhf phenotype.
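The allele-frequency comparison at the heart of this bulk-segregant analysis can be sketched in a few lines. The sketch assumes that, for each N2/LSJ2 variant, read counts supporting each allele are already available for the high and low exploration pools; the dictionary layout, variant labels, and function names are hypothetical and are not taken from the pipeline used in this study.

```python
def n2_allele_frequency(n2_reads, lsj2_reads):
    """Fraction of reads carrying the N2 allele, or None if there is no coverage."""
    total = n2_reads + lsj2_reads
    if total == 0:
        return None
    return n2_reads / total

def pool_difference(high_pool, low_pool):
    """Per-variant difference (high minus low) in N2 allele frequency.

    Each pool maps a variant label to a (n2_reads, lsj2_reads) tuple.
    Variants missing from either pool, or without coverage, are skipped.
    """
    diffs = {}
    for variant, counts in high_pool.items():
        if variant not in low_pool:
            continue
        f_high = n2_allele_frequency(*counts)
        f_low = n2_allele_frequency(*low_pool[variant])
        if f_high is None or f_low is None:
            continue
        diffs[variant] = f_high - f_low
    return diffs
```

Unlinked variants should give differences near zero, while variants linked to a causal locus give differences far from zero, with the sign indicating which parental haplotype is enriched in the high exploration pool.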

Fig 3. Exploration behavior differences of the RILhf strain map to the center of chromosome III.

(A) To map the changes in RILhf exploration behavior and fitness, we generated two panels of RILs (n = 48) between the RILhf and N2* strains or the RILhf and LSJ2 strains. (B) Each RIL was measured for exploration behavior using the assay shown in Fig 2C. Color coding shows how strains were combined into low (green) or high (orange) groups for bulk-segregant analysis. Uncolored RILs were not included. The histograms on the right display the distribution of the RILs’ exploration fractions (9 replicates were performed for each RIL). (C) Overview of bulk-segregant approach using pooled genomic DNA to calculate LSJ2/N2 allele frequency. (D) Allele frequency for LSJ2/N2 genetic differences was calculated for each population. A large allelic frequency difference was observed on chromosomes III and V in the LSJ2/RILhf panel.

A de novo complex genomic rearrangement is identified in the rcan-1 gene

To determine if the RILhf strain contains any de novo mutations in the center region of chromosome III, we sequenced genomic DNA isolated from the RILhf, N2*, and LSJ2 strains using Illumina short read sequencing. Although we did not identify any de novo SNVs or small indels on chromosome III in the RILhf strain, we did identify a large increase in coverage (2x – 8x) in the rcan-1 gene, which is an ortholog of human Down Syndrome gene RCAN1 (Fig 4A) [61]. This coverage increase was detected in the high exploration groups of both RIL panels, consistent with this genetic change causing increased exploration behavior (Fig 4A). The increased sequencing coverage suggests that the rcan-1 gene region has been amplified in the RILhf strain.
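A coverage increase of this kind is usually expressed as a fold change relative to a control strain after correcting for sequencing depth. The sketch below assumes that read depth has already been summarized in fixed genomic windows for a sample and a control, and that dividing by each strain's median window depth is an adequate normalization. Both the windowing and the median normalization are assumptions made for illustration and are not described in the text above.

```python
import statistics

def normalized_depth(window_depths):
    """Scale window depths by their median so that typical coverage equals 1.

    Assumes the median depth is greater than zero.
    """
    med = statistics.median(window_depths)
    return [d / med for d in window_depths]

def fold_change(sample_depths, control_depths):
    """Per-window ratio of normalized sample depth to normalized control depth.

    Windows with zero control depth are reported as None.
    """
    s = normalized_depth(sample_depths)
    c = normalized_depth(control_depths)
    return [si / ci if ci > 0 else None for si, ci in zip(s, c)]
```

With this normalization, single-copy windows give values near 1, and windows inside an amplified region give values near the relative copy number, so a non-uniform profile across one locus points to something more complex than a single tandem duplication.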

Fig 4. A de novo, complex rearrangement in rcan-1 in the RILhf strain.

(A) Illumina resequencing of the RILhf strain identified an increase in coverage at the rcan-1 locus that was not present in either the N2* or LSJ2 parental strains. This increase in coverage was linked with high exploration in both RIL panels, consistent with a role in exploration behavior. We were unable to resolve the exact nature of the genetic change using the short reads. (B) A single ~34.5 kb read from an Oxford Nanopore MinION resolved the rcan-1 rearrangement. This read was aligned to the N2 reference using blastn. Alignments are numbered on the y-axis. Alignment gaps can be caused by either poor sequence quality of the read or genomic rearrangements in the RILhf strain. The x-axis shows the position of the read relative to rcan-1. To resolve the junctions, we used chimeric reads from the Illumina resequencing in (A). (C) Dot plot of the rcan-1 rearrangement. The y-axis shows the reference sequence of rcan-1, and the x-axis shows the rearrangement in the RILhf strain. A total of six new junctions was observed, causing changes to the rcan-1 locus shown under the x-axis. Palindromic sequences in the 3rd intron of the rcan-1 gene body are also shown.

While a simple gene duplication event would cause an increase in coverage, we observed a non-uniform change in coverage across the affected region. We also identified a large number of chimeric or split reads (reads which partially align to two unique locations) that mapped to multiple locations within the rcan-1 locus (S1 Fig). The sequences of the chimeric reads within these groups were consistent with each other, and suggest that at least five new fusions between DNA sequences have occurred in the rcan-1 region of the RILhf strain. In other words, the rcan-1 genetic change consists of multiple inversion and/or duplication events. To resolve the precise mutation, we first attempted to amplify the entire affected region using PCR without success. As a complementary approach, we sequenced the RILhf strain using an Oxford Nanopore MinION, a long-read single molecule sequencing device with reported read lengths that could resolve the complex rearrangement [62]. By selecting reads that mapped to the rcan-1 region, we identified a single, ~34.5 kb long read that spanned the entire rcan-1 region (Fig 4B and S1 Data). This read resolved the large structural changes of this complex genomic rearrangement. By combining this long Nanopore read with the DNA fusion events predicted by the Illumina short read sequencing, we resolved the complex rearrangement into five unique tandem inversions interspaced within the rcan-1 locus (Fig 4C and S2–S4 Data). This proposed rearrangement was consistent with other Oxford Nanopore reads that did not entirely span the rearrangement and resolved the coverage increase and chimeric reads from the Illumina short-read resequencing (S2 Fig and S3 Fig) as well as smaller PCR products that cover the new junctions (S4 Fig and S2 Table).

rcan-1 complex genomic rearrangement is linked to changes in fitness and exploration behavior

To determine if this rearrangement was responsible for the increases in exploration behavior and relative fitness of the RILhf strain, we created two near isogenic lines (NILs) by backcrossing the rcan-1 rearrangement from the RILhf strain into the N2* background (Fig 5A). Genomic DNA from these NILs was sequenced to confirm that LSJ2-derived DNA and RILhf-specific mutations besides the rearrangement were removed from both NILs (S3 Table). As expected, both of these NILs explored a higher fraction of the bacterial lawn (Fig 5B). Pairwise competition experiments between the NILs and the N2* strain also demonstrated that this rearrangement is associated with the increases in fitness (Fig 5C). Finally, we were interested in whether the rearrangement affected fitness-proximal traits such as body size, growth rate, or reproduction. We used a high-throughput COPAS worm sorter to demonstrate that the NIL animals were shorter than wild type controls (Fig 5D), indicating that at least one fitness-proximal trait (body length) was affected. However, we cannot say whether this difference in body length is responsible for the change in fitness. These data support a causal role for the rcan-1 rearrangement in both the competitive fitness advantage and the exploration behavioral changes of the RILhf strain. However, we do not exclude a role for additional genetic mutations in regulating these phenotypes, such as genetic variation on chromosome V suggested by the bulk-segregant analysis of exploration behavior.

Fig 5. The rcan-1 rearrangement is linked to the exploration and fitness differences of the RILhf.

(A) A schematic of the pedigree used to create two near isogenic lines (NIL) by backcrossing the RILhf strain to the N2* strain. (B) The two rcan-1 NIL strains showed a similar exploration fraction as the RILhf strain. (NS: Not significant; ***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test). (C) The relative fitness differences of the NIL strains are comparable to the RILhf strain. Strains are shown on the x-axis, and the relative fitness of strain 1 is shown on the y-axis. (NS: Not significant; **: p < 0.01; ***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test). (D) A high-throughput development assay was used to measure animal lengths for the N2*, RILhf, and two NIL strains. Each point is a biological replicate, with the y-axis indicating the normalized median length of a population of animals. Animal length (μm) measurements are normalized by regressing out the differences among experiments (see Materials and Methods). The lengths of both NILs are significantly different from the lengths of the N2* and RILhf strains (***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test).

The rcan-1 rearrangement is predicted to cause a number of changes to the rcan-1 gene. First, it creates two full-length versions of the rcan-1 coding region (Fig 4C). However, the upstream region for each is modified by an inversion event in between the two coding regions. The first and second copies of rcan-1 contain 857 and 1,725 bp of endogenous upstream sequence before the inversion event occurs. While the core promoter region is likely conserved in both of the full-length rcan-1 versions, enhancers and other regulatory regions are probably missing or perturbed in the rearranged region, which might cause decreased, increased, or ectopic expression of rcan-1 (e.g. in C. elegans, at least 3 kb of upstream DNA is typically used to estimate the expression pattern of a given gene). Indeed, analysis of previously published ChIP-seq data [63–65] identified 39 transcription factors that bind throughout the upstream region of rcan-1 (S5 Fig and S4 Table). Second, the second copy of the rcan-1 gene also contains a small inversion in the 3’ UTR region. This inversion could modify binding sites for small RNAs or other RNA-binding proteins that regulate the stability or translation of the mRNA product. Additionally, the 3’ end of the small inversion is fused to an upstream promoter region; consequently, the native transcriptional terminator is missing from the second full-length copy of rcan-1. Finally, two truncated copies of the rcan-1 gene are also created, containing the first two exons of the gene. It is possible that truncated peptides with novel C-terminal fragments are produced from these copies and modify wildtype phenotype, although they lack the PxIxIT motif encoded by the last exon of rcan-1 that is required for RCAN-1 to bind with Calcineurin/TAX-6 [66]. It is difficult to predict a priori which of these changes alone or in combination could cause the changes to exploration behavior and/or evolutionary fitness.

The rearrangement of rcan-1 decreases its expression but is not a loss-of-function allele

To gain insights into the transcriptional changes caused by the rearrangement, we used RNA-seq to compare the genome-wide expression differences between the two NIL strains and the N2* strain. Interestingly, the gene with the largest change in expression was rcan-1, indicating that the rearrangement decreased transcription of the rcan-1 gene by about 75% (Fig 6A, S6 Fig, and S5 Table). In other contexts, gene duplications can modify phenotype by increasing gene dosage and expression; for the rcan-1 rearrangement, this is not the case.

Fig 6. The rcan-1 rearrangement allele decreases expression of rcan-1.

(A) A volcano plot of expression differences between the rcan-1 NIL1 and N2* strains. RNA was isolated from synchronized L4 animals. The gene with the largest and most significant expression decrease was rcan-1. Red: p<0.01, log2(Fold Change) > 1. Cyan: p<0.01, log2(Fold Change) < -1. (The list of differentially expressed genes with significance is available in S5 Table). (B) Co-injection of wild-type rcan-1 promoters driving GFP with wild-type rcan-1 or rearranged rcan-1 upstream promoter regions driving mCherry created from the RILhf strain. Each dot represents the ratio of total GFP expression divided by total mCherry expression from a single animal. Fluorescence was also segmented into head or body expression and compared separately (***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test. The top comparison group refers to the head fluorescence signal of the two truncated rcan-1 promoters compared to head of wild-type rcan-1). (C) Representative fluorescence images of (B). The white arrow in the middle panel indicates the neurons that retain high level of expression of mCherry driven by Prcan-1-R1. The white arrow in the bottom panel indicates the expression of mCherry driven by Prcan-1-R2 in the head region. Scale bar is 100 μm. (A: anterior; P: posterior). (D) Representative fluorescence images of the head showing a pair of interneurons with less affected expression. (These are different animals from C). The white arrow in the middle panel indicates the neurons that retain high level of expression of mCherry driven by Prcan-1-R1. Scale bar is 50 μm. (A: anterior; P: posterior).

Because the transcriptional profiling only reports whole-body changes in total rcan-1 expression, we created fusions of fluorescent proteins to the modified upstream regions of the different rcan-1 versions. We cloned the entire region between the two full-length versions of rcan-1 in both directions to create a Prcan-1-R1::mCherry construct (reporting expression of the first full-length version of rcan-1 in the complex rearrangement) and a Prcan-1-R2::mCherry construct (reporting expression of the second full-length version of rcan-1 in the complex rearrangement). As a control, we also cloned the first 5,085 bp of the upstream region from N2 and fused it to both GFP and mCherry (Prcan-1-WT::GFP or Prcan-1-WT::mCherry). We then co-injected Prcan-1-WT::GFP with Prcan-1-WT::mCherry, Prcan-1-R1::mCherry, or Prcan-1-R2::mCherry (Fig 6B). Using a microfluidic device combined with confocal microscopy, we imaged whole-body expression from both green and red channels (Fig 6C). As expected from a previous publication, we observed wild-type expression of rcan-1 in a variety of tissues, including neurons, pharyngeal cells, and hypodermal cells [61]. We first measured how the modified upstream regions affected whole-body expression by measuring the total amounts of GFP and mCherry signals from ~30 animals for each promoter construct (Fig 6B). Both of the constructs from the complex rearrangement drove less mCherry expression than the wild-type construct, although to different extents, with the first upstream rearrangement more affected. The effect of the different constructs on mCherry fluorescence levels was also tissue-specific, which we measured by quantifying the appropriate anatomical regions. For example, the head fluorescence was significantly more affected than the body fluorescence in both constructs (Fig 6B). Further, while the transcriptional reporter Prcan-1-R1 shows decreased fluorescence in the pharynx, fluorescence in two neurons is mostly unaffected. 
The cell bodies of these neurons are found in the retrovesicular ganglion and send a single process to the nerve cord. We tentatively identified these neurons as RIF or RIG. The reporter Prcan-1-R2 universally decreased the expression in the head (Fig 6D). Combined with the whole-body RNA-seq data, we suggest that rcan-1 expression is largely reduced in the RILhf strain because of changes to the upstream regions of the new versions of rcan-1. Potentially, there are cell-type-specific changes in transcription; however, extrachromosomal arrays are composed of dozens to hundreds of copies of the promoter region and might not reflect expression from the genomic promoter.

The above experiments suggest that the rcan-1 rearrangement could be beneficial because of a global reduction of rcan-1 transcription. However, an alternative hypothesis is simply that the rearrangement is beneficial because loss of rcan-1 activity is beneficial in laboratory conditions and the remaining residual expression is unrelated to the fitness of the animals. To test the second hypothesis, we used CRISPR-enabled genomic editing to delete rcan-1 in the N2* strain (S7 Fig). This knockout strain showed an intermediate phenotype between the N2* and the rcan-1 NIL strains in the modified exploration behavior (Fig 7A). When we competed this strain against the two rcan-1 NILs, we found that the rcan-1 rearrangement was substantially more fit than the rcan-1 deletion (Fig 7B). We also competed the strain containing the rcan-1 deletion against wild-type N2* and found no significant difference in fitness (Fig 7C). These data are consistent with the rearrangement producing residual or ectopic expression of rcan-1 that is necessary for the fitness gains. We attempted to rescue the RILhf exploration phenotype using a transgene created from a PCR product amplified from the wildtype rcan-1 region, however, this construct was unable to rescue the exploration behavior (S8 Fig). There are a number of putative explanations for this result. Potentially, the transgene does not fully recapitulate wildtype expression of rcan-1, lacking upstream elements required for expression in cells necessary for the changes in the exploration behavior. Alternatively, additional genetic variants in RILhf also promote exploration behavior independent of the rcan-1 rearrangement.

Fig 7. The rcan-1 rearrangement allele is not a loss of function allele but its complexity is necessary for fitness advantage and active exploration behavior.

(A) A large deletion of the rcan-1 coding region was created using CRISPR/Cas9 genomic editing of the N2* strain. The rcan-1 knockout modified exploration behavior but did not phenocopy the rcan-1 NIL strains (***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test). (B) Competition experiments demonstrated that a strain carrying an rcan-1 deletion allele was less fit than the rcan-1 NIL strain (NS: Not significant; **: p < 0.01; ***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test). (C) Competition experiments suggested that a strain carrying an rcan-1 deletion allele does not show a fitness advantage when competed against the rcan-1 wild-type strain. (NS: Not significant. Unpaired Mann-Whitney-Wilcoxon Test). (D) Plates seeded with a uniform bacterial lawn, which suppresses aggregation behavior of N2*, do not fully suppress the fitness advantage of the rcan-1 rearrangement allele. (*: p < 0.05. Unpaired Mann-Whitney-Wilcoxon Test). (E) F1 heterozygous N2* x RILhf animals and N2* x rcan-1 NIL1 animals show a significantly lower exploration fraction than RILhf and rcan-1 NIL1. (NS: Not significant; **: p < 0.01; ***: p < 0.001. one-way ANOVA tests followed by Tukey’s honest significant difference test).

While the changes in exploration behavior are genetically linked to the changes in fitness on laboratory plates, it is unknown whether the changes in exploration behavior are required for the fitness gains. To test this, we used plates seeded with a uniform bacterial lawn (UBL) across the entire plate. These plates lack the lawn borders that create O2 gradients and suppress aggregation behavior of N2* animals [56]. We competed RILhf against N2* on UBL plates and found that RILhf still showed a fitness advantage (Fig 7D), indicating that the behavioral change is not solely responsible for the fitness gains. We also tested whether the RILhf strain consumed more food than the N2* strain. We had previously found that a strain containing derived, beneficial alleles of npr-1 and glb-5 consumed more food on plates in an equal amount of time. However, the food consumption of the RILhf strain was statistically indistinguishable from that of the N2* strain (S9 Fig).

Finally, to explore the relationship between gene dosage of rcan-1 and exploration behavior, we assayed heterozygotes between the RILhf, NIL and rcan-1 deletion strains (Fig 7E). These experiments suggest that there is a strong relationship between rcan-1 dosage and exploration behavior, as the heterozygotes for each of these crosses were intermediate to the parental strains.

Discussion

In this study, we identified an outlier RIL with higher relative fitness than either parental strain. This RIL also displayed a new behavioral phenotype not seen in either parental strain, resulting in increased exploration activity on laboratory agar plates. By mapping this trait, we identified a tandem set of inversion/duplications in the rcan-1 gene that seemed to influence both the exploration behavior and relative fitness of this RIL in standard laboratory conditions. A complex genomic rearrangement affecting phenotype has been found in C. elegans before [67]; however, that rearrangement was created in response to a chemical mutagen. The genomic rearrangement of rcan-1 was unexpectedly complex and provides insight into how gene duplication and rearrangement can occur in microevolutionary timescales.

Gene duplicates are thought to be a primary source of genetic material for the generation of evolutionary novelty, however, it is unclear how duplicates can arise and then navigate an evolutionary trajectory from redundancy to a state where both copies are maintained by natural selection as paralogs [68]. Two major issues in understanding how new gene copies evolve are understanding how gene duplicates initially spread through a population and the evolutionary forces responsible for functional differences in the two copies. Our work here suggests how both can occur due to a single mutational event. While some models of gene duplication have focused on the role of masking deleterious mutations or the role of genetic drift and purifying selection in spreading gene duplicates in a population [69, 70], our results suggest that positive selection can also be involved. After isolation from the wild, evolutionary mismatch between C. elegans and its laboratory environment resulted in N2* being at a point away from an adaptive peak. One route to increase its fitness was by changing rcan-1 activity, which was accomplished by a complex genetic change that creates two duplicated copies of the rcan-1 coding region. This complex genomic rearrangement was created naturally during the short RILhf construction period (10 generations). We propose this genomic rearrangement occurred as a single genomic instability event, potentially caused by replication stress or mis-annealing during Okazaki fragment processing in DNA replication. The complex rearrangement might be a unique repair result induced by an initial error that activated a DNA replication checkpoint and the DNA repair machinery [71–73]. Although the RILhf strain shows a fitness advantage, the breeding pedigree of the RIL panel is designed to minimize fitness effects. We do not propose that the RILhf was selected for by positive selection despite its high fitness. 
However, our work demonstrates how the origin of new gene copies can provide a fitness advantage in new environments, where large functional changes to specific genes can be advantageous.

Our work also suggests how functional variation between two gene copies, a second major issue for understanding the evolution of paralogous genes, can arise in a single mutational step. The rearrangement of rcan-1 causes large-scale changes to upstream noncoding regions, 3’UTR regions, and the creation of two truncated versions of rcan-1 coding sequence. For short evolutionary timescales, this type of genetic variant could potentially access changes to gene function that would be difficult for a single SNV, insertion-deletion, or tandem duplication to cause. For example, besides changing the upstream promoter region that determines the exact levels and tissues the rcan-1 gene is expressed in, the complex rearrangement is also predicted to create an rcan-1 mRNA with a modified 3’ UTR, potentially modifying its translational regulation or mRNA stability. Due to the differences in promoter region and 3’ UTR, it is possible that the two copies of rcan-1 are not functionally redundant because they may have different expression levels or tissue-specific expression. It will be interesting to determine the precise amounts of protein that are produced by each copy and whether deleting each of these copies of rcan-1 has a negative effect on fitness.

Our data indicate that the rearrangement reduces expression of rcan-1. Further, analysis of heterozygotes suggests that exploration behavior is sensitive to gene dosage of rcan-1. Our working model is that changes in expression of rcan-1 are responsible for the changes to exploration behavior. Exploration is controlled by a distributed neural circuit [60, 74, 75]. Modifying rcan-1 activity in these neurons could be responsible for the behavioral changes.

rcan-1 encodes an ortholog of the human RCAN1 gene [66], which encodes a calcipressin family protein that inhibits the calcineurin A protein phosphatase [76]. In humans, RCAN1 plays an important role in human health; it has been proposed to be a key contributor to Down Syndrome phenotypes in patients with trisomy 21 [76, 77] and chronic overexpression of RCAN1 in mice results in phenotypes related to Alzheimer’s disease [78]. In C. elegans, rcan-1 is required for memory of temperature exposure through a tax-6/calcineurin-family and crh-1/CREB-dependent pathway [66]. Thermotaxis, however, is not predicted to be important for laboratory fitness, and it is likely that the rcan-1 rearrangement regulates other unknown aspects of C. elegans biology on which selection can act. Unlike the standard N2 strain, which is potentially more fit in laboratory environments due to its ability to consume more food than the N2* strain, a strain containing the rcan-1 rearrangement showed no difference in food consumption compared to the N2* strain. However, we found that animals that carry the rcan-1 rearrangement were shorter than the N2* strain. rcan-1 was previously shown to regulate body size using loss-of-function mutations [79]. It should be interesting to determine the exact phenotypes that are responsible for the gains in fitness in laboratory conditions. While an increasing number of causal genetic variants that modify phenotype and fitness are being identified, few examples demonstrating the exact phenotypes responsible for fitness changes have been worked out.

It will be interesting to study the continued evolution of a strain carrying this rearrangement, as it is unlikely that this strain has reached its adaptive peak in a single mutational step. Will additional beneficial mutations act through rcan-1? One possibility is that cis-regulatory mutations could fine tune the expression of each copy of rcan-1 in causal tissues. These mutations could act to further diversify the function of each copy of rcan-1. Alternatively, one of the duplicated rcan-1 copies could be subsequently lost, as seen in experimental evolution of poxviruses [40].

As long-read sequencing technology improves, the ability to identify complex structural variants similar to the one that we described here will increase. It will be interesting to see how often these types of variants survive the actions of purifying and positive selection to become common in natural populations of C. elegans and other animals.

Materials and methods

C. elegans growth conditions

Animals were cultivated on standard nematode growth medium (NGM) plates containing 2% agar seeded with 200 μL of an overnight culture of the E. coli strain OP50 [50]. Ambient temperature was controlled using an incubator set at 20°C. Strains were grown for at least three generations without starvation before any experiments were conducted.

Strains

The following strains were used in this study. For each figure, a list of strains used is included in S1 Table.

Near isogenic lines (NILs):

CX12311 (N2*)—kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2)

PTM413 (rcan-1 NIL 1) kahIR16(III, CX12348>N2), kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2)

PTM414 (rcan-1 NIL 2), kahIR17(III, CX12348>N2), kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2)

Recombinant inbred lines (RILs):

CX12311 –LSJ2 RILs: CX12312-19, CX12321-27, CX12346-52, CX12354-60, CX12362-66, CX12368-75, CX12381-88, CX12414-37, CX12495-99, CX12501-08, CX12510, CX12361

CX12311—CX12348 (RILhf) RILs: PTM378-397, PTM421-434, PTM494-503

LSJ2—CX12348 (RILhf) RILs: PTM435-478

CRISPR-generated knockout and barcoded strains:

PTM505: dpy-10 (kah83) II, rcan-1(kah183) III, kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2)

PTM288: kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2) dpy-10(kah83)II;

Extrachromosomal array strains:

PTM553 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx169[Prcan-1-WT::GFP 25ng/μL; Prcan-1-WT::mCherry 25ng/μL] Isolate 1.

PTM554 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx170[Prcan-1-WT::GFP 25ng/μL; Prcan-1-WT::mCherry 25ng/μL] Isolate 2.

PTM555 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx171[Prcan-1-WT::GFP 25ng/μL; Prcan-1-WT::mCherry 25ng/μL] Isolate 3.

PTM556 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx172[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R2::mCherry 25ng/μL] Isolate 1.

PTM557 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx173[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R2::mCherry 25ng/μL] Isolate 2.

PTM558 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx174[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R2::mCherry 25ng/μL] Isolate 3.

PTM559 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx175[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R1::mCherry 25ng/μL] Isolate 1.

PTM560 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx176[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R1::mCherry 25ng/μL] Isolate 2.

PTM561 kyIR1(V, CB4856>N2), qgIR1(X, CB4856>N2), kahEx177[Prcan-1-WT::GFP 25ng/μL; Prcan-1-R1::mCherry 25ng/μL] Isolate 3.

PTM566 CX12348 kahEx185[50ng/ul Prcan-1::rcan-1; 45ng/ul pSM;5ng/ul pCFJ90]

PTM567 CX12348 kahEx185[50ng/ul Prcan-1::rcan-1; 45ng/ul pSM;5ng/ul pCFJ90]

Strain construction

To create the CX12311-CX12348 and LSJ2-CX12348 RILs, CX12311 males or LSJ2 males were crossed to CX12348 hermaphrodites. 96 F2 progeny (48 from CX12311 x CX12348; 48 from LSJ2 x CX12348) were cloned to individual plates and allowed to self-fertilize for 10 generations to create the inbred lines. One RIL line was lost from the LSJ2xCX12348 cross, creating 47 RILs.

To create the rcan-1 NILs (PTM413 and PTM414), CX12348 animals were backcrossed to CX12311 for 10 generations. Two completely independent sets of crosses were used to create two independent lines. Primers used to identify male animals containing the rearrangement were: 5’—gagacaatactctgatattagacgcacca -3’ and 5’–gctgacaccagcaatcattgttca -3’.

To create the rcan-1 deletion strain (PTM505), two sgRNAs targeting the 5’ region of rcan-1 and two sgRNAs targeting the 3’ end of rcan-1 were created: sgRNA1: 5’-atttggaagatcatctttac-TGG-3’; sgRNA2: 5’-agtgctgatcaatgatccat-TGG-3’; sgRNA3: 5’-cgtggcatttcaattgctga-TGG-3’; sgRNA4: 5’-tcacatggagatgaagggcg-TGG-3’. CoCRISPR [80] was used to simultaneously edit the dpy-10 and rcan-1 genes using the following injection mix: 50ng/μL Peft-3::Cas9, 10ng/μL dpy-10 sgRNA, 25ng/μL of each of the four rcan-1 sgRNAs, and 500nM dpy-10(cn64) repair oligonucleotide. This mix was injected into CX12311 animals and Dpy or Rol animals were singled and genotyped using PCR. An animal with the deleted sequence 5’-caatggatcattgatca…..cacgcccttcatctccat-3’ was identified.

To create the GFP/mCherry extrachromosomal lines (PTM553-PTM561), four constructs were created. Prcan-1-WT::GFP was created by amplifying the rcan-1 promoter from CX12311 genomic DNA using primers 5’-ctgGGCCGGCCtcggttcaaatacctcatgggaca-3’ and 5’- ttGGCGCGCCtttttgttgttaacttatagaaaaaatttcagcaacca-3’ and cloning it into the pSM-GFP backbone with restriction enzyme sites 5’-FseI and 3’-AscI. To create Prcan-1-WT::mCherry, the rcan-1 promoter was amplified from CX12311 genomic DNA using primers 5’-tcggttcaaatacctcatgggaca-3’ and 5’- tttttgttgttaacttatagaaaaaatttcagcaacca-3’ and a pCFJ90-mCherry backbone was amplified using primers 5’- attttttctataagttaacaacaaaaaAcaagtttgtacaaaaaagcaggct-3 and 5’- ccatgaggtatttgaaccgaatagcttggcgtaatcatggtcat-3’. The two fragments were assembled using HI-FI assembly (NEB E5520S). To construct the Prcan-1-R1::mCherry and Prcan-1-R2::mCherry plasmids, 5’- tttttgttgttaacttatagaaaaaatttcagca-3’ and 5’- gaaacgaaacaaggtgggtcc-3’ or 5’- tttttgttgttaacttatagaaaaaatttcagca-3’ and 5’- agcggacccaccttgtttc-3’ were used to amplify the rearranged promoters from CX12348 genomic DNA. These PCR products were cloned into a pCFJ90-mCherry backbone using HI-FI assembly. Concentrations of each plasmid are indicated for each strain in the strain description.

Competition experiment

Competition experiments were performed as described previously [56]. In the standard assays, 9 cm NGM plates were seeded with 300 μL of an overnight E. coli OP50 culture and incubated at room temperature for three days. In the competition experiment using a uniform bacterial lawn (UBL), the 9 cm UBL plates were made by pouring overnight E. coli OP50 culture onto the NGM plate to cover the whole plate. Excess culture was poured off and the plates were left at 20°C overnight to form a uniform bacterial lawn. In each competition experiment, ten L4 larvae from each strain were picked onto a single plate and cultured for five days. Animals were transferred to identically prepared NGM plates and subsequently transferred every four days. Depending on the experiment, five or seven total transfers were performed. For each transfer, animals were washed off the plates using M9 buffer and collected into 1.5 mL centrifuge tubes. The animals were then mixed by inversion and allowed to stand for approximately one minute to settle adult animals. 50 μL of the supernatant containing approximately 1000–2000 L1-L2 animals was seeded onto fresh plates. The remaining animals were concentrated and used for genomic DNA isolation. Genomic DNA was collected every odd generation using a Zymo DNA isolation kit (D4071). To quantify the relative proportion of each strain, a digital PCR assay was performed with custom TaqMan fluorescent-quenching probes (Applied Biosciences). Genomic DNA was digested with SacI or EcoRI for 30 min at 37°C. The digested products were purified using a Zymo DNA cleanup kit (D4064) and diluted to approximately 1–2 ng/μL. Seven TaqMan probes were designed using ABI software that targeted WBVar00051876, WBVar00601322, WBVar00167214, WBVar00601493, WBVar00601538, dpy-10 (kah82), or tbc-10(kah185) (S6 Table). Digital PCR assays were performed using a Biorad QX200 digital PCR machine with the standard probe absolute quantification protocol. 
The relative allele proportion was calculated for each DNA sample from the number of droplets with fluorescence signal for each allele (Eq 1). To calculate the relative fitness of the two strains using three or four measurements of relative allele proportion, we used linear regression to fit these data to a one-locus generic selection model (Eqs 2 and 3), assuming one generation per transfer.

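$$P(A)_t = \frac{\text{No. Allele } A}{\text{No. Allele } A + \text{No. Allele } a} \tag{1}$$
$$P(A)_t = \frac{P(A)_0 W_{AA}^{t}}{P(A)_0 W_{AA}^{t} + (1 - P(A)_0) W_{aa}^{t}} \tag{2}$$
$$\log\left(\frac{\frac{P(A)_0}{P(A)_t} - P(A)_0}{1 - P(A)_0}\right) = \left(\log\left(\frac{W_{aa}}{W_{AA}}\right)\right) t \tag{3}$$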
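
The step from Eq 2 to Eq 3 can be made explicit. The sketch below uses the shorthand p_t = P(A)_t and r = W_aa/W_AA, which is notation introduced here for brevity and does not appear in the original text. Taking the reciprocal of Eq 2 and rearranging gives a relationship that is linear in t and passes through the origin:

```latex
\begin{align*}
\frac{1}{p_t} &= \frac{p_0 W_{AA}^{t} + (1-p_0) W_{aa}^{t}}{p_0 W_{AA}^{t}}
              = 1 + \frac{1-p_0}{p_0}\, r^{t} \\
\frac{p_0}{p_t} - p_0 &= (1-p_0)\, r^{t} \\
\log\left(\frac{\frac{p_0}{p_t} - p_0}{1-p_0}\right) &= t \,\log r
\end{align*}
```

The slope of this line is log(W_aa/W_AA), so the relative fitness W_AA/W_aa follows from the negative of the fitted slope.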

The relative fitness value and Taqman assay information for each competition experiment are included in S1 Table.
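
As an illustration of how Eqs 1–3 can be applied, the following minimal Python sketch estimates relative fitness from allele proportions. It is not the authors' analysis code: the function names, the choice to fit the regression through the origin (Eq 3 has no intercept term), and the example numbers are all assumptions made for this sketch.

```python
import math


def allele_proportion(count_A, count_a):
    """Eq 1: proportion of allele A among positive droplets."""
    return count_A / (count_A + count_a)


def relative_fitness(p0, proportions, generations):
    """Estimate W_AA / W_aa from allele proportions using Eq 3.

    p0          -- starting proportion of allele A (assumed known)
    proportions -- measured P(A)_t values
    generations -- the matching t values (one generation per transfer)
    """
    # Left-hand side of Eq 3 for each measurement.
    ys = [math.log((p0 / p - p0) / (1.0 - p0)) for p in proportions]
    # Least-squares slope for y = slope * t with no intercept.
    # Assumption: the original analysis may have handled the intercept differently.
    slope = sum(t * y for t, y in zip(generations, ys)) / sum(
        t * t for t in generations
    )
    # slope = log(W_aa / W_AA), so flip the sign to report W_AA / W_aa.
    return math.exp(-slope)


# Hypothetical example: allele A starts at 50% and is measured at
# generations 1, 3, and 5.
example = relative_fitness(0.5, [0.55, 0.64, 0.72], [1, 3, 5])
```

A value above 1 indicates that the strain carrying allele A is the fitter of the two under this model.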

Exploration behavioral assay

The exploration assays from Flavell et al. were modified to study exploration in the presence of circular lawns [60]. 35 mm Petri dishes were seeded with 150 μL of OP50 E. coli bacteria 24 h before the start of the assay. Individual L4 hermaphrodites were placed in the center of the plate and cultivated at 20°C for 16 hours. The plates were placed on a grid of 100 squares that covered the whole bacterial lawn. To calculate the exploration fraction, the number of full or partial squares that contained the animal’s tracks out of the bacterial lawn border was quantified. The number of full or partial squares that contained the bacterial lawn was also counted (about 94–96 grids). The exploration fraction was calculated (Eq 4).

$$\text{Exploration fraction} = \frac{\text{No. grids containing tracks}}{\text{No. grids containing bacterial lawn}} \tag{4}$$

Heterozygous exploration assay

Heterozygous F1 animals were created by mating PTM288 males with the other strain of interest at the L4 stage for one day. Fertilized hermaphrodites were then singled onto individual plates. After two days, L4 hermaphrodites were picked from plates where many males were present, indicating successful mating. After the assay, these animals were individually lysed and genotyped at the dpy-10(kah83)II site to confirm that they were heterozygous. The genotyping primers were: Forward primer: 5’–gtcagatgatctaccggtgtgtcac—3’, reverse primer: 5’–gtctctcctggtgctccgtcttcac– 3’.

rcan-1 rescue assay

A PCR fragment that covers 4.5kb upstream to 0.7kb downstream of rcan-1 was cloned using NEB Phusion Q5 PCR system (Forward primer: 5’–gctccatacgcgcatttcag– 3’, reverse primer: 5’–tcttctcgaagccgttcacc– 3’). The PCR product was purified and injected at 50ng/uL with 5ng/uL of pCFJ90 and 45ng/uL pSM. The exploration fraction of the animals expressing mCherry was quantified using the standard exploration behavior assay.

Bulk-segregant analysis of exploration behavior

The exploration behavioral assays were performed on 48 CX12311-CX12348 RILs and 47 LSJ2-CX12348 RILs. In the CX12311/CX12348 RILs, 28 RILs with median exploration fraction less than 0.575 were assigned to the low exploration group and the 20 RILs with median exploration fraction greater than or equal to 0.575 were assigned to the high exploration group. In the LSJ2/CX12348 RILs group, the 17 RILs with median exploration fraction less than 0.620 were assigned to the low exploration group, the 20 RILs with median exploration behavior greater than or equal to 0.870 were assigned to high exploration group, and the rest of the RILs were excluded from further analysis. Genomic DNA from each RIL (100 ng) was isolated and pooled into the four described groups for whole-genome resequencing.
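
The grouping rule described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the authors' script: the function names and the RIL labels in the example are hypothetical, and only the cutoffs (0.575 for the CX12311/CX12348 RILs; 0.620 and 0.870 for the LSJ2/CX12348 RILs) come from the text.

```python
from statistics import median


def exploration_fraction(n_track_squares, n_lawn_squares):
    """Eq 4: squares with tracks divided by squares containing lawn."""
    return n_track_squares / n_lawn_squares


def assign_groups(ril_fractions, low_cutoff, high_cutoff):
    """Split RILs into low/high/excluded pools by median exploration fraction.

    ril_fractions -- dict mapping a RIL name to its per-animal fractions
    A RIL is 'low' if its median is below low_cutoff, 'high' if its median
    is at or above high_cutoff, and 'excluded' otherwise.
    """
    groups = {"low": [], "high": [], "excluded": []}
    for ril, values in sorted(ril_fractions.items()):
        m = median(values)
        if m < low_cutoff:
            groups["low"].append(ril)
        elif m >= high_cutoff:
            groups["high"].append(ril)
        else:
            groups["excluded"].append(ril)
    return groups


# Hypothetical data for three RILs (names are placeholders).
demo = {
    "RIL_a": [0.30, 0.50, 0.40],
    "RIL_b": [0.90, 0.95, 0.88],
    "RIL_c": [0.70, 0.75, 0.72],
}
lsj2_style = assign_groups(demo, low_cutoff=0.620, high_cutoff=0.870)
cx12311_style = assign_groups(demo, low_cutoff=0.575, high_cutoff=0.575)
```

Using the same value for both cutoffs reproduces the single-threshold split used for the CX12311/CX12348 RILs, in which no RIL is excluded.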

Whole-genome sequencing

Genomic DNA was isolated using the Qiagen Gentra Puregene Kit (158667) following the supplementary protocol for nematodes. The genomic DNA was further purified using the Zymo Quick-DNA kit (D4068). DNA libraries were prepared using an Illumina Nextera DNA kit (FC-121-1030) with indexes (FC-121-1011). The prepared libraries were sequenced as 35 bp or 150 bp paired-end reads using an Illumina NextSeq 500. The reads were aligned to the reference genome using BWA-aligner v0.7.17 [81]. BAM files were deduplicated and processed using SAMtools v1.9 [82] and Picard [83] (http://broadinstitute.github.io/picard/). SNVs were called by Freebayes and annotated by SnpEff [84, 85]. Custom Python scripts using the pysam library (https://github.com/pysam-developers/pysam) were used to identify regions of the genome with a large number of clipped and chimeric reads. Read depths were visualized using IGV [86]. The sequencing reads were uploaded to the SRA under BioProject PRJNA526525.
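
The custom scripts themselves are not reproduced in the text. As a hedged illustration of the general idea, the sketch below counts clipped or chimeric alignments per genomic bin. It relies only on attributes that pysam's `AlignedSegment` objects provide (`is_unmapped`, `cigartuples`, `has_tag`, `reference_name`, `reference_start`), so reads from `pysam.AlignmentFile(...).fetch()` could be passed in directly. The 1 kb bin size, the function names, and the rule for what counts as "chimeric" (presence of an SA tag) are assumptions of this sketch.

```python
from collections import Counter

# CIGAR operation codes defined by the SAM specification.
SOFT_CLIP, HARD_CLIP = 4, 5


def is_clipped_or_chimeric(read):
    """True if the alignment is clipped or has a supplementary (SA) alignment."""
    cigar = read.cigartuples or []
    clipped = any(op in (SOFT_CLIP, HARD_CLIP) for op, _length in cigar)
    return clipped or read.has_tag("SA")


def count_by_bin(reads, bin_size=1000):
    """Count clipped/chimeric reads per (chromosome, bin index).

    reads    -- iterable of pysam.AlignedSegment-like objects
    bin_size -- bin width in bp (1000 is an arbitrary choice for this sketch)
    """
    counts = Counter()
    for read in reads:
        if read.is_unmapped:
            continue
        if is_clipped_or_chimeric(read):
            key = (read.reference_name, read.reference_start // bin_size)
            counts[key] += 1
    return counts
```

Bins whose counts stand far above the genome-wide background would then be candidates for structural variant breakpoints.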

Oxford Nanopore long-read sequencing

Genomic DNA of CX12348 was isolated from animals grown on eight 9 cm NGM plates using the Qiagen Gentra Puregene Kit (158667) following the supplementary protocol for nematodes. The genomic DNA was concentrated and purified using the Zymo Quick-DNA kit (D4068). Size-selection to collect DNA fragments from 10 kbp– 50 kbp was carried out using a Blue-pippin. The sequencing library was prepared using the 1D ligation kit (SQK-LSK108) following the standard protocol. DNA was repaired using the NEBNext FFPE Repair Mix (M6630). After DNA repair, end preparation was performed and the adapter was ligated. 600 ng of prepared library was loaded into a Nanopore R9 flow cell in a MinION sequencer. The standard 48-hour sequencing protocol was performed and approximately 5 Gb of sequencing data was generated. To resolve the structure of the rcan-1 complex rearrangement, the FASTQ files were aligned to the reference genome using BWA aligner. Reads that covered the rcan-1 gene region and contained a gap in alignment were fetched using pysam (https://github.com/pysam-developers/pysam). These reads were then mapped to rcan-1 using BLAST and visualized with matplotlib (https://matplotlib.org) to show the rearrangement events. The structure of the complex rearrangement was verified by using BWA and IGV to map the Illumina short reads or FlexiDot [87] to map and visualize the Oxford Nanopore reads. The sequencing reads were uploaded to the SRA under BioProject PRJNA526525.

RNA-seq and transcriptome analysis

CX12311, PTM413, and PTM414 were synchronized using alkaline-bleach to isolate embryos, which were washed with M9 buffer and placed on a tube roller overnight. Approximately 400 hatched L1 animals were placed on NGM agar plates for each strain and incubated at 20°C for 48 hours. The ~L4 stage animals were washed off for standard RNA isolation using Trizol. Four replicates for each strain were performed on different days. The RNA libraries were prepared using the NEB Next Ultra II Directional RNA Library Prep Kit (E7760S) following its standard protocol. The libraries were sequenced on an Illumina NextSeq 500. The reads were aligned by HISAT2 using default parameters for paired-end sequencing. Transcript abundance was calculated using HTseq and then used as input for SARTools [88, 89]. edgeR v3.16.5 was used for normalization and differential analysis [55, 90]. The analysis result was shown in a volcano plot. CX12311 was treated as the wild type. Genes shown as significantly differentially expressed in the volcano plot met the thresholds |log2(fold change)| > 1 and FDR-adjusted p-value < 0.01. Sequencing reads were uploaded to the SRA under BioProject PRJNA526525.
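
To make the volcano-plot thresholds concrete, the short Python sketch below classifies a gene from its log2 fold change and FDR-adjusted p-value and converts a percent change in expression into a log2 fold change. The function names are hypothetical, and the sketch only restates the thresholds given above; the differential analysis itself was done with edgeR. A decrease of about 75%, as reported for rcan-1, corresponds to a log2 fold change of about -2.

```python
import math


def classify_gene(log2_fold_change, fdr, lfc_cutoff=1.0, fdr_cutoff=0.01):
    """Label a gene using |log2(fold change)| > 1 and FDR < 0.01."""
    if fdr < fdr_cutoff and log2_fold_change > lfc_cutoff:
        return "up"
    if fdr < fdr_cutoff and log2_fold_change < -lfc_cutoff:
        return "down"
    return "not significant"


def percent_change_to_log2fc(percent_change):
    """Convert a percent change in expression (e.g. -75) to log2 fold change."""
    return math.log2(1.0 + percent_change / 100.0)


# A 75% reduction leaves 25% of the original expression: log2(0.25) = -2.
rcan1_like = percent_change_to_log2fc(-75)
```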

Imaging

The detailed steps of microfluidic device fabrication were previously reported [91]. For each experiment, about 100–150 animals were suspended in 1 mL of S Basal and delivered into the device using a syringe. Animals were immobilized using 1 mL of tetramisole hydrochloride (200 mM) (Sigma-Aldrich cas. 5086-74-8) in S Basal. Images were acquired on a spinning disk confocal microscope (PerkinElmer UltraVIEW VoX) with a Hamamatsu FLASH 4 sCMOS camera. Images of the animals were quantified using ImageJ. A region-of-interest (ROI) was drawn around the entire worm, and the mean intensities of the GFP and mCherry images were calculated across the ROI. Relative fluorescence intensity was calculated as (Mean Intensity of mCherry)/(Mean Intensity of GFP).
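
The quantification was performed in ImageJ; the sketch below only illustrates the arithmetic of the ROI-based ratio described above, using plain Python lists in place of images. The function names and the toy pixel values are assumptions of this sketch.

```python
def mean_in_roi(image, mask):
    """Mean pixel intensity over the pixels where mask is truthy.

    image -- 2D list of pixel intensities
    mask  -- 2D list of the same shape; truthy entries are inside the ROI
    """
    total, n = 0.0, 0
    for image_row, mask_row in zip(image, mask):
        for value, inside in zip(image_row, mask_row):
            if inside:
                total += value
                n += 1
    if n == 0:
        raise ValueError("ROI mask selects no pixels")
    return total / n


def relative_fluorescence(mcherry, gfp, mask):
    """(Mean intensity of mCherry) / (mean intensity of GFP) within one ROI."""
    return mean_in_roi(mcherry, mask) / mean_in_roi(gfp, mask)


# Toy 2x2 images; only the top row is inside the ROI.
roi = [[1, 1], [0, 0]]
ratio = relative_fluorescence([[10, 30], [99, 99]], [[40, 40], [1, 1]], roi)
```

Normalizing the mCherry signal to the co-injected GFP signal in the same animal helps control for differences in array copy number and expression between individuals.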

Food consumption assay

The experimental method was described previously [56]. In brief, 24-well plates were prepared by pipetting 0.75 mL of NGM agar containing 25 μM FUDR and 1x Antibiotic-Antimycotic (ThermoFisher 15240062) into each well. Each well was seeded with 20 μL of freshly cultured E. coli OP50-GFP(pFPV25.1) at an OD600 of 4.0 (CFU ~ 3.2×10^9/mL). The plates were dried in a fume hood with air flow for 1.5 hr. The fluorescence signal of OP50-GFP was quantified by the area scanning protocol using a BioTek Synergy H4 multimode plate reader. Synchronized L4 animals were placed in the wells of the first five columns (10 animals per well), and the last column was used as a control column. The plate was incubated in a 20°C incubator for 18 hours and the fluorescence signal was quantified again as the ending time point. The relative food consumption amount was calculated using the equations reported previously [56].

High-throughput growth rate analysis

The high-throughput growth rate and brood size assays were performed as described previously [92]. In short, approximately 25 bleach-synchronized embryos were aliquoted into each well of 96-well plates, and fed 5 mg/mL HB101 bacterial lysate on the following day [93]. After 48 hours of growing at 20°C, a large-particle flow cytometer (COPAS BIOSORT, Union Biometrica, Holliston, MA) was used to sort three L4 larvae into each well of a 96-well plate with 50 μL of K medium with HB101 lysate (10 mg/mL) and Kanamycin (50 μM). Animals were grown for 96 hours at 20°C and were then treated with sodium azide (50 mM in M9). Animal number (n) and animal length (time of flight, TOF) were measured by the BIOSORT. For each well, animal growth was measured as the median length of the population, and brood size was measured as the number of progeny per sorted animal. The experiments were replicated in two independent assays, and the linear model with the formula (phenotype ~ assay) was applied to normalize the differences among assays [94].
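
For a single categorical predictor, the residuals of the linear model phenotype ~ assay are each measurement minus the mean of its assay. The minimal sketch below shows that calculation; treating assay as a categorical factor and carrying the residuals forward as the normalized phenotype are assumptions of this sketch, and the function name and example values are hypothetical.

```python
from collections import defaultdict


def regress_out_assay(records):
    """Return residuals of phenotype ~ assay for (assay_id, phenotype) pairs.

    With assay as a categorical factor, the fitted value for each record is
    the mean phenotype of its assay, so the residual is value - assay mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for assay, value in records:
        sums[assay] += value
        counts[assay] += 1
    means = {assay: sums[assay] / counts[assay] for assay in sums}
    return [value - means[assay] for assay, value in records]


# Two hypothetical assays with different baselines.
residuals = regress_out_assay(
    [("assay1", 10.0), ("assay1", 14.0), ("assay2", 20.0), ("assay2", 26.0)]
)
```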

Statistical test

The raw data are included in S1 Table. To assess statistical significance, we performed one-way ANOVA tests followed by Tukey’s honest significant difference test to correct for multiple comparisons or the Wilcoxon-Mann-Whitney nonparametric test for pairwise comparisons. NS: not significant; *: p < 0.05; **: p < 0.01; ***: p < 0.001.

QTL mapping

The average of the log2(w) of each N2*/LSJ2 RIL was used as the phenotype with 192 previously genotyped SNPs. R/qtl was used to perform a one-dimensional scan using marker regression on the 192 markers. The genome-wide significance threshold (p = 0.05) was determined by 1000 permutation tests [95].
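
The mapping itself was done in R/qtl. As a rough illustration of what marker regression with a permutation threshold computes, the Python sketch below scores each marker by comparing the residual sum of squares with and without genotype as a predictor, LOD = (n/2) log10(RSS0/RSS1), and takes the genome-wide threshold from the distribution of the maximum LOD over shuffled phenotypes. Function names, the random seed, and the example data are assumptions of this sketch, and it ignores missing genotypes.

```python
import math
import random


def _rss(values):
    """Residual sum of squares around the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def lod_score(genotypes, phenotypes):
    """Single-marker LOD from a one-way model of phenotype on genotype class."""
    rss0 = _rss(list(phenotypes))  # null model: one overall mean
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss1 = sum(_rss(ys) for ys in groups.values())  # one mean per genotype
    if rss0 == 0:
        return 0.0
    if rss1 == 0:
        return math.inf
    return (len(phenotypes) / 2.0) * math.log10(rss0 / rss1)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD threshold from the maximum LOD of shuffled phenotypes.

    markers -- list of genotype vectors, one per marker
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(lod_score(g, shuffled) for g in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]
```

Shuffling phenotypes while keeping genotypes fixed preserves the correlation between linked markers, which is why the maximum LOD across all markers is recorded in each permutation.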

List of key resources and reagents

The key resources and reagents used in this study are listed in S6 Table.

Supporting information

S1 Fig. Illumina reads mapped to the rcan-1 locus.

(A) IGV plot of Illumina short sequencing reads aligned to the rcan-1 genomic locus. (B) Chimeric reads aligned to the rcan-1 genomic locus. Reads are from the resequencing of the N2* (CX12311), LSJ2, and RILhf (CX12348) strains. Besides an increase in coverage at the rcan-1 locus, a large number of chimeric reads (i.e. reads that partially map to two locations) were found in the RILhf strain. (Grey reads are normal reads (pair orientation: LR); cyan reads imply an inversion (pair orientation: LL); blue reads imply an inversion (pair orientation: RR); green reads imply a duplication or translocation (pair orientation: RL). Red reads have larger than expected inferred sizes.)

(TIF)

S2 Fig. Dot plot of the nanopore sequencing reads aligned to the proposed rcan-1 rearrangement.

Ten nanopore sequencing reads that overlapped the rcan-1 structural variant were used to generate a dot plot against the proposed rcan-1 rearrangement.

(TIF)

S3 Fig. Illumina short sequencing reads aligned to the proposed rcan-1 structural variant.

Top: All reads aligned to the rcan-1 rearrangement. Bottom: Chimeric reads aligned to the rcan-1 rearrangement. The uniform coverage and lack of chimeric reads are consistent with the proposed structure of the rearrangement. (Grey reads are normal reads (pair orientation: LR); cyan reads imply an inversion (pair orientation: LL); red reads have larger than expected inferred sizes. Reads with no fill color have low mapping quality.)

(TIF)

S4 Fig. The PCR products include the rearranged regions.

Red arrows indicate the PCR products that include the rearranged regions. Detailed information on the primers, and the expected and observed lengths of each PCR product on an agarose gel, are listed in S2 Table.

(TIF)

S5 Fig. Transcription factor binding regions at the rcan-1 5’-UTR.

The green bars represent the transcription factor binding regions. The red bars represent the two truncated promoter regions that drive the full-length rcan-1 gene body in the complex rearrangement. The blue bar represents the highly occupied target (‘HOT’) region. The figure was generated from the WormBase JBrowse genome browser by adding the transcription factor binding region track. Information on the transcription factors is listed in S4 Table.

(TIF)

S6 Fig. Volcano plot of rcan-1 NIL2 gene expression vs. N2*.

Red dots indicate genes with increased expression in rcan-1 NIL2 vs. N2* (p < 0.01, log2(Fold Change) > 1). Cyan dots indicate genes with decreased expression in rcan-1 NIL2 vs. N2* (p < 0.01, log2(Fold Change) < -1). The list of significantly differentially expressed genes is available in S5 Table.

(TIF)

S7 Fig. Strategy for creating a knockout allele of rcan-1 using CRISPR/Cas9.

Positions of the two pairs of sgRNAs that target the 5’ and 3’ ends of the rcan-1 coding region are shown. The resulting deletion allele is shown as a blue box.

(TIF)

S8 Fig. Exploration fraction of rcan-1 rescue lines.

The RILhf animals were co-injected with 50 ng/μL Prcan-1(4.5 kb)::rcan-1 PCR product, 5 ng/μL pCFJ90, and 45 ng/μL pSM. The exploration fraction of the animals that expressed mCherry was measured.

(TIF)

S9 Fig. Food consumption assay of RILhf and rcan-1 NILs.

Relative food consumption of indicated strains. Each dot indicates one experimental replicate.

(TIF)

S1 Data. rcan-1_NanoporeReads.txt.

This file contains the sequence of the Oxford Nanopore reads (Fig 4B and S2 Fig) that overlap the structural variant in fasta format.

(TXT)

S2 Data. rcan-1_RearrangementSequence.txt.

This file contains sequence information of the proposed rcan-1 structural variant in fasta format.

(TXT)

S3 Data. rcan-1_RearrangementSequence.txt.

This file contains annotated gene and junction information for the structural variant in Genbank format.

(TXT)

S4 Data. rcan-1_RearrangementSequence.dna.

This file contains annotated gene and junction information for the structural variant in SnapGene format. It also contains the primer information used to study the structural variant. The file can be viewed with SnapGene or SnapGene Viewer (SnapGene Viewer is free software).

(DNA)

S1 Table. Raw data.

This table includes the raw experimental data for Figs 1–7, S8 Fig, and S9 Fig.

(XLSX)

S2 Table. Rearranged junction sequences.

This table includes the junction sequences for the rcan-1 structural variant. Primer information and the size of each PCR product are also included.

(XLSX)

S3 Table. NIL resequencing.

This table includes all genetic variants identified in the rcan-1 near isogenic lines (NILs).

(XLSX)

S4 Table. TF binding regions in the 5’ UTR.

This table summarizes the transcription factor binding information for the rcan-1 5’ upstream region, from WormBase.

(XLSX)

S5 Table. NIL_RNA-Seq.

This table includes all gene expression data for rcan-1 NILs.

(XLSX)

S6 Table. Sequence information of TaqMan probes and summary of resources and reagents.

This table lists sequence information for the TaqMan fluorescent quenching probes used for competition experiments. It also lists the key resources and reagents used in this study.

(XLSX)

Acknowledgments

We thank the Caenorhabditis Genetics Center for strains; Todd Streelman, Levi Morran, Chao Jiang, Wei Zhang, Will Ratcliff, Annalise Paaby, and members of the Streelman and McGrath labs for discussions; and WormBase.

Data Availability

All RNA-seq and resequencing files are available from the SRA database NIH BioProject PRJNA526525.

Funding Statement

This work was supported by NIH GM114170 (to P.T.M.), a John N. Nicholson fellowship (to S.C.B.), an NSF CAREER Award (to E.C.A.), and NIH NS096581, GM088333, AG056436 (to H.L.). The John Nicholson Fellowship URL is: https://www.tgs.northwestern.edu/funding/fellowships-and-grants/internal-fellowships/nicholson-fellowship.html. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Maydan JS, Lorch A, Edgley ML, Flibotte S, Moerman DG. Copy number variation in the genomes of twelve natural isolates of Caenorhabditis elegans. BMC Genomics. 2010;11:62. Epub 2010/01/27. doi:10.1186/1471-2164-11-62
  • 2. Katju V, Bergthorsson U. Copy-number changes in evolution: rates, fitness effects and adaptive significance. Front Genet. 2013;4:273. Epub 2013/12/26. doi:10.3389/fgene.2013.00273
  • 3. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, et al. An integrated map of structural variation in 2,504 human genomes. Nature. 2015;526(7571):75–81. Epub 2015/10/04. doi:10.1038/nature15394
  • 4. Long E, Evans C, Chaston J, Udall JA. Genomic Structural Variations Within Five Continental Populations of Drosophila melanogaster. G3 (Bethesda). 2018;8(10):3247–53. Epub 2018/08/17. doi:10.1534/g3.118.200631
  • 5. Audano PA, Sulovari A, Graves-Lindsay TA, Cantsilieris S, Sorensen M, Welch AE, et al. Characterizing the Major Structural Variant Alleles of the Human Genome. Cell. 2019;176(3):663–75.e19. Epub 2019/01/22. doi:10.1016/j.cell.2018.12.019
  • 6. Fuentes RR, Chebotarov D, Duitama J, Smith S, De la Hoz JF, Mohiyuddin M, et al. Structural variants in 3000 rice genomes. Genome Res. 2019;29(5):870–80. Epub 2019/04/18. doi:10.1101/gr.241240.118
  • 7. Lupski JR. Genomic rearrangements and sporadic disease. Nat Genet. 2007;39(7 Suppl):S43–7. Epub 2007/09/05. doi:10.1038/ng2084
  • 8. Chen JM, Cooper DN, Ferec C, Kehrer-Sawatzki H, Patrinos GP. Genomic rearrangements in inherited disease and cancer. Semin Cancer Biol. 2010;20(4):222–33. Epub 2010/06/15. doi:10.1016/j.semcancer.2010.05.007
  • 9. Martin CL, Kirkpatrick BE, Ledbetter DH. Copy number variants, aneuploidies, and human disease. Clin Perinatol. 2015;42(2):227–42, vii. Epub 2015/06/05. doi:10.1016/j.clp.2015.03.001
  • 10. Rice AM, McLysaght A. Dosage sensitivity is a major determinant of human copy number variant pathogenicity. Nat Commun. 2017;8:14366. Epub 2017/02/09. doi:10.1038/ncomms14366
  • 11. Hieronymus H, Murali R, Tin A, Yadav K, Abida W, Moller H, et al. Tumor copy number alteration burden is a pan-cancer prognostic factor associated with recurrence and death. Elife. 2018;7. Epub 2018/09/05. doi:10.7554/eLife.37294. PubMed Central PMCID: PMC6145837.
  • 12. Chain FJ, Feulner PG. Ecological and evolutionary implications of genomic structural variations. Front Genet. 2014;5:326. Epub 2014/10/04. doi:10.3389/fgene.2014.00326
  • 13. Fan S, Meyer A. Evolution of genomic structural variation and genomic architecture in the adaptive radiations of African cichlid fishes. Front Genet. 2014;5:163. Epub 2014/06/12. doi:10.3389/fgene.2014.00163
  • 14. Wellenreuther M, Merot C, Berdan E, Bernatchez L. Going beyond SNPs: The role of structural genomic variants in adaptive evolution and species diversification. Mol Ecol. 2019;28(6):1203–9. Epub 2019/03/06. doi:10.1111/mec.15066
  • 15. Lye ZN, Purugganan MD. Copy Number Variation in Domestication. Trends Plant Sci. 2019;24(4):352–65. Epub 2019/02/13. doi:10.1016/j.tplants.2019.01.003
  • 16. Dorshorst B, Molin AM, Rubin CJ, Johansson AM, Stromstedt L, Pham MH, et al. A complex genomic rearrangement involving the endothelin 3 locus causes dermal hyperpigmentation in the chicken. PLoS Genet. 2011;7(12):e1002412. Epub 2012/01/05. doi:10.1371/journal.pgen.1002412
  • 17. Cook DE, Lee TG, Guo X, Melito S, Wang K, Bayless AM, et al. Copy number variation of multiple genes at Rhg1 mediates nematode resistance in soybean. Science. 2012;338(6111):1206–9. Epub 2012/10/16. doi:10.1126/science.1228746
  • 18. Durkin K, Coppieters W, Drogemuller C, Ahariz N, Cambisano N, Druet T, et al. Serial translocation by means of circular intermediates underlies colour sidedness in cattle. Nature. 2012;482(7383):81–4. Epub 2012/02/03. doi:10.1038/nature10757
  • 19. Wang Y, Xiong G, Hu J, Jiang L, Yu H, Xu J, et al. Copy number variation at the GL7 locus contributes to grain size diversity in rice. Nat Genet. 2015;47(8):944–8. Epub 2015/07/07. doi:10.1038/ng.3346
  • 20. Yassin A, Delaney EK, Reddiex AJ, Seher TD, Bastide H, Appleton NC, et al. The pdm3 Locus Is a Hotspot for Recurrent Evolution of Female-Limited Color Dimorphism in Drosophila. Curr Biol. 2016;26(18):2412–22. Epub 2016/08/23. doi:10.1016/j.cub.2016.07.016
  • 21. Ohno S. Evolution by gene duplication. Berlin, New York: Springer-Verlag; 1970. xv, 160 p.
  • 22. Taylor JS, Raes J. Duplication and divergence: the evolution of new genes and old ideas. Annu Rev Genet. 2004;38:615–43. Epub 2004/12/01. doi:10.1146/annurev.genet.38.072902.092831
  • 23. Kunte K, Zhang W, Tenger-Trolander A, Palmer DH, Martin A, Reed RD, et al. doublesex is a mimicry supergene. Nature. 2014;507(7491):229–32. Epub 2014/03/07. doi:10.1038/nature13112
  • 24. Tuttle EM, Bergland AO, Korody ML, Brewer MS, Newhouse DJ, Minx P, et al. Divergence and Functional Degradation of a Sex Chromosome-like Supergene. Curr Biol. 2016;26(3):344–50. Epub 2016/01/26. doi:10.1016/j.cub.2015.11.069
  • 25. Fuller ZL, Koury SA, Phadnis N, Schaeffer SW. How chromosomal rearrangements shape adaptation and speciation: Case studies in Drosophila pseudoobscura and its sibling species Drosophila persimilis. Mol Ecol. 2019;28(6):1283–301. Epub 2018/11/08. doi:10.1111/mec.14923
  • 26. Barrick JE, Yu DS, Yoon SH, Jeong H, Oh TK, Schneider D, et al. Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature. 2009;461(7268):1243. doi:10.1038/nature08480
  • 27. Blount ZD, Borland CZ, Lenski RE. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc Natl Acad Sci U S A. 2008;105(23):7899–906. Epub 2008/06/06. doi:10.1073/pnas.0803151105
  • 28. Brown CJ, Todd KM, Rosenzweig RF. Multiple duplications of yeast hexose transport genes in response to selection in a glucose-limited environment. Mol Biol Evol. 1998;15(8):931–42. Epub 1998/08/27. doi:10.1093/oxfordjournals.molbev.a026009
  • 29. Dunham MJ, Badrane H, Ferea T, Adams J, Brown PO, Rosenzweig F, et al. Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 2002;99(25):16144–9. Epub 2002/11/26. doi:10.1073/pnas.242624799
  • 30. Elena SF, Lenski RE. Evolution experiments with microorganisms: the dynamics and genetic bases of adaptation. Nat Rev Genet. 2003;4(6):457–69. Epub 2003/05/31. doi:10.1038/nrg1088
  • 31. Gresham D, Desai MM, Tucker CM, Jenq HT, Pai DA, Ward A, et al. The repertoire and dynamics of evolutionary adaptations to controlled nutrient-limited environments in yeast. PLoS Genet. 2008;4(12):e1000303. Epub 2008/12/17. doi:10.1371/journal.pgen.1000303
  • 32. Kao KC, Sherlock G. Molecular characterization of clonal interference during adaptive evolution in asexual populations of Saccharomyces cerevisiae. Nat Genet. 2008;40(12):1499–504. Epub 2008/11/26. doi:10.1038/ng.280
  • 33. Kasahara T, Abe K, Mekada K, Yoshiki A, Kato T. Genetic variation of melatonin productivity in laboratory mice under domestication. Proc Natl Acad Sci U S A. 2010;107(14):6412–7. Epub 2010/03/24. doi:10.1073/pnas.0914399107
  • 34. Levy SF, Blundell JR, Venkataram S, Petrov DA, Fisher DS, Sherlock G. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature. 2015;519(7542):181. doi:10.1038/nature14279
  • 35. Orozco-terWengel P, Kapun M, Nolte V, Kofler R, Flatt T, Schlotterer C. Adaptation of Drosophila to a novel laboratory environment reveals temporally heterogeneous trajectories of selected alleles. Mol Ecol. 2012;21(20):4931–41. Epub 2012/06/26. doi:10.1111/j.1365-294X.2012.05673.x
  • 36. Ratcliff WC, Denison RF, Borrello M, Travisano M. Experimental evolution of multicellularity. Proc Natl Acad Sci U S A. 2012;109(5):1595–600. Epub 2012/02/07. doi:10.1073/pnas.1115323109
  • 37. Rose MR. Artificial Selection on a Fitness-Component in Drosophila Melanogaster. Evolution. 1984;38(3):516–26. Epub 1984/05/01. doi:10.1111/j.1558-5646.1984.tb00317.x
  • 38. Stanley CE Jr., Kulathinal RJ. Genomic signatures of domestication on neurogenetic genes in Drosophila melanogaster. BMC Evol Biol. 2016;16:6. Epub 2016/01/06. doi:10.1186/s12862-015-0580-1
  • 39. Castagnone-Sereno P, Mulet K, Danchin EGJ, Koutsovoulos GD, Karaulic M, Da Rocha M, et al. Gene copy number variations as signatures of adaptive evolution in the parthenogenetic, plant-parasitic nematode Meloidogyne incognita. Mol Ecol. 2019;28(10):2559–72. Epub 2019/04/10. doi:10.1111/mec.15095
  • 40. Elde NC, Child SJ, Eickbush MT, Kitzman JO, Rogers KS, Shendure J, et al. Poxviruses deploy genomic accordions to adapt rapidly against host antiviral defenses. Cell. 2012;150(4):831–41. Epub 2012/08/21. doi:10.1016/j.cell.2012.05.049
  • 41. Farslow JC, Lipinski KJ, Packard LB, Edgley ML, Taylor J, Flibotte S, et al. Rapid Increase in frequency of gene copy-number variants during experimental evolution in Caenorhabditis elegans. BMC Genomics. 2015;16:1044. Epub 2015/12/10. doi:10.1186/s12864-015-2253-2
  • 42. Lauer S, Avecilla G, Spealman P, Sethia G, Brandt N, Levy SF, et al. Single-cell copy number variant detection reveals the dynamics and diversity of adaptation. PLoS Biol. 2018;16(12):e3000069. Epub 2018/12/19. doi:10.1371/journal.pbio.3000069
  • 43. Lauer S, Gresham D. An evolving view of copy number variants. Curr Genet. 2019. Epub 2019/05/12. doi:10.1007/s00294-019-00980-0
  • 44. Venkataram S, Dunn B, Li Y, Agarwala A, Chang J, Ebel ER, et al. Development of a comprehensive genotype-to-fitness map of adaptation-driving mutations in yeast. Cell. 2016;166(6):1585–96.e22. doi:10.1016/j.cell.2016.08.002
  • 45. Vergara IA, Mah AK, Huang JC, Tarailo-Graovac M, Johnsen RC, Baillie DL, et al. Polymorphic segmental duplication in the nematode Caenorhabditis elegans. BMC Genomics. 2009;10:329. Epub 2009/07/23. doi:10.1186/1471-2164-10-329
  • 46. Konrad A, Flibotte S, Taylor J, Waterston RH, Moerman DG, Bergthorsson U, et al. Mutational and transcriptional landscape of spontaneous gene duplications and deletions in Caenorhabditis elegans. Proc Natl Acad Sci U S A. 2018;115(28):7386–91. Epub 2018/06/27. doi:10.1073/pnas.1801930115
  • 47. Yoshimura J, Ichikawa K, Shoura MJ, Artiles KL, Gabdank I, Wahba L, et al. Recompleting the Caenorhabditis elegans genome. Genome Res. 2019;29(6):1009–22. Epub 2019/05/28. doi:10.1101/gr.244830.118
  • 48. Kim C, Kim J, Kim S, Cook DE, Evans KS, Andersen EC, et al. Long-read sequencing reveals intra-species tolerance of substantial structural variations and new subtelomere formation in C. elegans. Genome Res. 2019;29(6):1023–35. Epub 2019/05/28. doi:10.1101/gr.246082.118
  • 49. Sterken MG, Snoek LB, Kammenga JE, Andersen EC. The laboratory domestication of Caenorhabditis elegans. Trends Genet. 2015;31(5):224–31. Epub 2015/03/26. doi:10.1016/j.tig.2015.02.009
  • 50. Brenner S. The genetics of Caenorhabditis elegans. Genetics. 1974;77(1):71–94. Epub 1974/05/01.
  • 51. McGrath PT, Xu Y, Ailion M, Garrison JL, Butcher RA, Bargmann CI. Parallel evolution of domesticated Caenorhabditis species targets pheromone receptor genes. Nature. 2011;477(7364):321–5. Epub 2011/08/19. doi:10.1038/nature10378
  • 52. de Bono M, Bargmann CI. Natural variation in a neuropeptide Y receptor homolog modifies social behavior and food response in C. elegans. Cell. 1998;94(5):679–89. Epub 1998/09/19. doi:10.1016/s0092-8674(00)81609-8
  • 53. Duveau F, Felix MA. Role of pleiotropy in the evolution of a cryptic developmental variation in Caenorhabditis elegans. PLoS Biol. 2012;10(1):e1001230. Epub 2012/01/12. doi:10.1371/journal.pbio.1001230
  • 54. Large EE, Xu W, Zhao Y, Brady SC, Long L, Butcher RA, et al. Selection on a Subunit of the NURF Chromatin Remodeler Modifies Life History Traits in a Domesticated Strain of Caenorhabditis elegans. PLoS Genet. 2016;12(7):e1006219. Epub 2016/07/29. doi:10.1371/journal.pgen.1006219
  • 55. McGrath PT, Rockman MV, Zimmer M, Jang H, Macosko EZ, Kruglyak L, et al. Quantitative mapping of a digenic behavioral trait implicates globin variation in C. elegans sensory behaviors. Neuron. 2009;61(5):692–9. Epub 2009/03/17. doi:10.1016/j.neuron.2009.02.012
  • 56. Zhao Y, Long L, Xu W, Campbell RF, Large EE, Greene JS, et al. Changes to social feeding behaviors are not sufficient for fitness gains of the Caenorhabditis elegans N2 reference strain. Elife. 2018;7. Epub 2018/10/18. doi:10.7554/eLife.38675
  • 57. Xu W, Long L, Zhao Y, Stevens L, Felipe I, Munoz J, et al. Evolution of Yin and Yang isoforms of a chromatin remodeling subunit precedes the creation of two genes. eLife. 2019;8:e48119. doi:10.7554/eLife.48119
  • 58. de Bono M, Tobin DM, Davis MW, Avery L, Bargmann CI. Social feeding in Caenorhabditis elegans is induced by neurons that detect aversive stimuli. Nature. 2002;419(6910):899–903. Epub 2002/11/01. doi:10.1038/nature01169
  • 59. Gray JM, Karow DS, Lu H, Chang AJ, Chang JS, Ellis RE, et al. Oxygen sensation and social feeding mediated by a C. elegans guanylate cyclase homologue. Nature. 2004;430(6997):317–22. Epub 2004/06/29. doi:10.1038/nature02714
  • 60. Flavell SW, Pokala N, Macosko EZ, Albrecht DR, Larsch J, Bargmann CI. Serotonin and the neuropeptide PDF initiate and extend opposing behavioral states in C. elegans. Cell. 2013;154(5):1023–35. doi:10.1016/j.cell.2013.08.001
  • 61. Lee JI, Dhakal BK, Lee J, Bandyopadhyay J, Jeong SY, Eom SH, et al. The Caenorhabditis elegans homologue of Down syndrome critical region 1, RCN-1, inhibits multiple functions of the phosphatase calcineurin. Journal of molecular biology. 2003;328(1):147–56. doi:10.1016/s0022-2836(03)00237-7
  • 62. Tyson JR, O'Neil NJ, Jain M, Olsen HE, Hieter P, Snutch TP. MinION-based long-read sequencing and assembly extends the Caenorhabditis elegans reference genome. Genome Res. 2018;28(2):266–74. Epub 2017/12/24. doi:10.1101/gr.221184.117
  • 63. Jänes J, Dong Y, Schoof M, Serizay J, Appert A, Cerrato C, et al. Chromatin accessibility dynamics across C. elegans development and ageing. Elife. 2018;7:e37344. doi:10.7554/eLife.37344
  • 64. Araya CL, Kawli T, Kundaje A, Jiang L, Wu B, Vafeados D, et al. Regulatory analysis of the C. elegans genome with spatiotemporal resolution. Nature. 2014;512(7515):400. doi:10.1038/nature13497
  • 65. Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, et al. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science. 2010;330(6012):1775–87. doi:10.1126/science.1196914
  • 66. Li W, Bell HW, Ahnn J, Lee SK. Regulator of Calcineurin (RCAN-1) Regulates Thermotaxis Behavior in Caenorhabditis elegans. J Mol Biol. 2015;427(22):3457–68. Epub 2015/08/02. doi:10.1016/j.jmb.2015.07.017
  • 67. Itani OA, Flibotte S, Dumas KJ, Moerman DG, Hu PJ. Chromoanasynthetic genomic rearrangement identified in a N-ethyl-N-nitrosourea (ENU) mutagenesis screen in Caenorhabditis elegans. G3: Genes, Genomes, Genetics. 2016;6(2):351–6.
  • 68. Innan H, Kondrashov F. The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet. 2010;11(2):97–108. Epub 2010/01/07. doi:10.1038/nrg2689
  • 69. Clark AG. Invasion and maintenance of a gene duplication. Proc Natl Acad Sci U S A. 1994;91(8):2950–4. Epub 1994/04/12. doi:10.1073/pnas.91.8.2950
  • 70. Lynch M, Force A. The probability of duplicate gene preservation by subfunctionalization. Genetics. 2000;154(1):459–73. Epub 2000/01/11.
  • 71. Neelsen KJ, Lopes M. Replication fork reversal in eukaryotes: from dead end to dynamic response. Nature Reviews Molecular Cell Biology. 2015;16(4):207. doi:10.1038/nrm3935
  • 72. Polleys EJ, House NC, Freudenreich CH. Role of recombination and replication fork restart in repeat instability. DNA repair. 2017;56:156–65. doi:10.1016/j.dnarep.2017.06.018
  • 73. Kugelberg E, Kofoid E, Andersson DI, Lu Y, Mellor J, Roth FP, et al. The tandem inversion duplication in Salmonella enterica: selection drives unstable precursors to final mutation types. Genetics. 2010;185(1):65–80. doi:10.1534/genetics.110.114074
  • 74. Pradhan S, Quilez S, Homer K, Hendricks M. Environmental programming of adult foraging behavior in C. elegans. Current Biology. 2019;29(17):2867–79.e4. doi:10.1016/j.cub.2019.07.045
  • 75. Rhoades JL, Nelson JC, Nwabudike I, Stephanie KY, McLachlan IG, Madan GK, et al. ASICs Mediate Food Responses in an Enteric Serotonergic Neuron that Controls Foraging Behaviors. Cell. 2019;176(1–2):85–97.e14. doi:10.1016/j.cell.2018.11.023
  • 76. Fuentes JJ, Genesca L, Kingsbury TJ, Cunningham KW, Perez-Riba M, Estivill X, et al. DSCR1, overexpressed in Down syndrome, is an inhibitor of calcineurin-mediated signaling pathways. Hum Mol Genet. 2000;9(11):1681–90. Epub 2000/06/22. doi:10.1093/hmg/9.11.1681
  • 77. Arron JR, Winslow MM, Polleri A, Chang CP, Wu H, Gao X, et al. NFAT dysregulation by increased dosage of DSCR1 and DYRK1A on chromosome 21. Nature. 2006;441(7093):595–600. Epub 2006/03/24. doi:10.1038/nature04678
  • 78. Martin KR, Corlett A, Dubach D, Mustafa T, Coleman HA, Parkington HC, et al. Over-expression of RCAN1 causes Down syndrome-like hippocampal deficits that alter learning and memory. Hum Mol Genet. 2012;21(13):3025–41. Epub 2012/04/19. doi:10.1093/hmg/dds134
  • 79. Li W, Choi T-W, Ahnn J, Lee S-K. Allele-Specific Phenotype Suggests a Possible Stimulatory Activity of RCAN-1 on Calcineurin in Caenorhabditis elegans. Molecules and cells. 2016;39(11):827. doi:10.14348/molcells.2016.0222
  • 80. Arribere JA, Bell RT, Fu BX, Artiles KL, Hartman PS, Fire AZ. Efficient marker-free recovery of custom genetic modifications with CRISPR/Cas9 in Caenorhabditis elegans. Genetics. 2014;198(3):837–46. Epub 2014/08/28. doi:10.1534/genetics.114.169730
  • 81. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25(14):1754–60. doi:10.1093/bioinformatics/btp324
  • 82. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. doi:10.1093/bioinformatics/btp352
  • 83. http://broadinstitute.github.io/picard/index.html.
  • 84. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907. 2012.
  • 85. Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly. 2012;6(2):80–92. doi:10.4161/fly.19695
  • 86. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative genomics viewer. Nature biotechnology. 2011;29(1):24. doi:10.1038/nbt.1754
  • 87. Seibt KM, Schmidt T, Heitkam T. FlexiDot: highly customizable, ambiguity-aware dotplots for visual sequence analyses. Bioinformatics. 2018;34(20):3575–7. Epub 2018/05/16. doi:10.1093/bioinformatics/bty395
  • 88. Varet H, Brillet-Guéguen L, Coppée J-Y, Dillies M-A. SARTools: a DESeq2- and edgeR-based R pipeline for comprehensive differential analysis of RNA-Seq data. PLoS One. 2016;11(6):e0157022. doi:10.1371/journal.pone.0157022
  • 89. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31(2):166–9. doi:10.1093/bioinformatics/btu638
  • 90. Chen Y, Lun AT, Smyth GK. Differential expression analysis of complex RNA-seq experiments using edgeR. In: Statistical analysis of next generation sequencing data. Springer; 2014. p. 51–74.
  • 91. Lee H, Kim SA, Coakley S, Mugno P, Hammarlund M, Hilliard MA, et al. A multi-channel device for high-density target-selective stimulation and long-term monitoring of cells and subcellular features in C. elegans. Lab on a Chip. 2014;14(23):4513–22. doi:10.1039/c4lc00789a
  • 92. Evans KS, Brady SC, Bloom JS, Tanny RE, Cook DE, Giuliani SE, et al. Shared genomic regions underlie natural variation in diverse toxin responses. Genetics. 2018;210(4):1509–25. doi:10.1534/genetics.118.301311
  • 93. Garcia-Gonzalez AP, Ritter AD, Shrestha S, Andersen EC, Yilmaz LS, Walhout AJM. Bacterial Metabolism Affects the C. elegans Response to Cancer Chemotherapeutics. Cell. 2017;169(3):431–41.e8. Epub 2017/04/22. doi:10.1016/j.cell.2017.03.046
  • 94. Shimko TC, Andersen EC. COPASutils: an R package for reading, processing, and visualizing data from COPAS large-particle flow cytometers. PLoS One. 2014;9(10):e111090. Epub 2014/10/21. doi:10.1371/journal.pone.0111090. PubMed Central PMCID: PMC4203834.
  • 95. Broman KW, Sen S. A Guide to QTL Mapping with R/qtl. Springer; 2009.

Decision Letter 0

Harmit S Malik, Kirsten Bomblies

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

19 Sep 2019

* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. *

Dear Dr McGrath,

Thank you very much for submitting your Research Article entitled 'A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of C. elegans' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Harmit S. Malik

Associate Editor

PLOS Genetics

Kirsten Bomblies

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Review uploaded as attachment

Reviewer #2: In this manuscript, Zhao et al. characterize a structural rearrangement in the rcan-1 gene in C. elegans, and link this genetic change to alterations in foraging behavior and fitness. First, they use a panel of N2/LSJ2 RILs to investigate how genetic changes impact fitness, using a competitive fitness assay. They identify a single strain that appears to be an outlier and then show that this strain also displays an increase in exploratory foraging behavior. They then use genetic mapping and sequencing to show that this strain has a de novo genetic change in the rcan-1 gene that is linked to the fitness and foraging phenotypes. Zhao et al. characterize the genetic change in rcan-1 and show that it is a surprisingly complex genomic rearrangement that arose during generation of the RILs. Genetic studies show that the change in rcan-1 is causal for the fitness and foraging phenotypes. Moreover, they find that the rcan-1 genomic rearrangement alters expression levels of rcan-1.

Overall, this is an interesting and convincing paper that should be of broad interest to the readers of PLOS Genetics. This detailed study of a new genomic rearrangement and its strong ties to fitness and behavioral changes provides new insights into how structural variants cause phenotypic variability. I suggest just a few ways in which the manuscript can be improved and indicate the degree to which this would be preferred (vs. necessary), in my view.

1. The authors characterize the structural rearrangement in rcan-1 through long-read sequencing and confirm that the Illumina sequencing aligns well to the proposed rearrangement. They indicate that they were unable to PCR the entire region. If possible, I’d still suggest a series of smaller PCRs aimed at confirming their proposed rearrangement. I realize this is non-trivial given the repetitive nature of the rearrangement, but perhaps primers could be designed to span the new sequences at breakpoints (i.e. right around the chimeric reads). Overall, they already have strong evidence for their proposed rearrangement, but this structural variant is the central point of the paper and it would be preferable to have overwhelming confidence in their interpretation.

2. The authors provide a solid first-pass analysis of how the rearrangement of rcan-1 causes the foraging and fitness phenotypes, by comparing the NILs to a null mutant and examining how the non-coding changes impact expression (using fluorescent reporters). However, this is obviously still not fully resolved. For example, do the two truncated rcan-1 copies have any functional roles? Is this simply a gene dosage effect (seems unlikely…)? There are further analyses that could provide insights: A) What is the phenotype of the heterozygous NIL? B) Do CRISPR-induced frameshift mutations in each of the four ORFs that comprise the genomic rearrangement impact the NIL phenotype? If possible, I’d suggest further exploration of this topic, but I do not view it as absolutely necessary for publication (especially given the challenges of targeting these repetitive copies via CRISPR, etc).

Minor points:

1. Is the distribution of datapoints in Fig. 3B bimodal, as would be expected from a single causal variant? It is difficult to assess this as presented. Please provide a histogram or something to that effect.

2. The images in Fig. 6C are rather small and difficult to assess. I’d suggest larger, high-quality images that give the reader a more complete view of sites of expression. In addition, providing a well-labeled image for each of the promoter regions used would be helpful for future mechanistic studies of rcan-1 function.

Reviewer #3: In this manuscript, the authors investigate the genetic basis of increased fitness in a single C. elegans outlier recombinant inbred line (RIL) that has higher fitness than its parental strains. This higher fitness is correlated with increased exploration of a bacterial lawn. The authors provide compelling data that show that the major causative allele contributing to these phenotypes is a de novo complex structural rearrangement of the rcan-1 gene involving a series of tandem inversions/duplications that ultimately leads to reduced expression of rcan-1 due primarily to truncations of the rcan-1 5’ noncoding region. Though the paper does not explore the specific cellular mechanisms by which this rearrangement and rcan-1 affect fitness/exploration, it is a very interesting and clearly-written study based on solid data and is appropriate for PLOS Genetics. In particular, the genetic and long-read sequencing data to map the rearrangement and determine its structure are very strong. We suggest a few experiments that could strengthen the paper, but do not feel that these are critical for publication.

Considerations:

1. One of the interesting conclusions of the paper is that even though the rearrangement creates two full-length copies of the rcan-1 coding sequence, there is actually reduced rcan-1 expression (~25% normal total levels). This is supported by solid RNA-seq data and expression reporters (Fig. 6). Given that the 5’ regulatory regions of both intact rcan-1 ORFs are truncated, reduced expression makes sense. The simplest model is that the truncations remove important enhancers and these enhancers no longer function as well when present more distantly in the remaining intact 5’ regulatory region. However, the authors push the data further to argue that there are tissue-specific effects on expression and that the two orientations of the inverted and truncated 5’ region have different effects on expression. These latter two conclusions are not as well supported by the existing data that are based on expression reporters of the truncated/inverted 5’ regulatory region in the two orientations (Fig. 6B-D). The problem is that these expression reporters are extrachromosomal arrays that are overexpressed and are often variable in structure and expression levels from line to line or even different animals of the same line and thus should not be used for the kinds of fine quantitative inferences the authors make, such as the head expression being more affected than the body, especially as these inferences are based on a very small number of lines. To be able to make such fine comparisons the authors should generate single-copy transgenes that are not prone to variability. Additionally, the images in Fig 6C&D do not clearly show the distinct and tissue-specific effects the authors claim, which could again be the result of analysis of a small number of selected lines. Higher magnification images with better quantitation of expression in specific cells of interest (such as the unaffected head neurons) could also help.

We suggest that the authors either repeat these expression experiments with single-copy transgenes or tone down their conclusions (lines 337-349, especially lines 347-349). For the data in Fig. 6B, they should specify whether the ~30 animals analyzed come from one or more transgenic lines. Three lines of each are shown in the strain list, but it isn’t clear whether the data come from all three lines. In the raw data (Table S1), please list the strain and allele # of each animal analyzed as well.

2. The authors provide good evidence that the residual expression of rcan-1 in the rearrangement is important for increased fitness by showing that near-isogenic lines (NILs) carrying this rearrangement outcompete a rcan-1 null mutant (Fig. 6F). The rcan-1 null has increased exploration (Fig. 6E) compared to a WT control, but it is unclear whether the null mutant also has increased fitness. To test this, a good experiment would be to compete the rcan-1 null vs. the WT control. This experiment would further address the argument that reduced expression of rcan-1 increases fitness and whether fitness is indeed tightly correlated with exploration.

It is clear that some of the complexity of the rearrangement is necessary for its fitness advantage, but it is unclear how much of the complexity is necessary. The simplest model is that the two intact rcan-1 ORFs with their altered upstream regulatory regions would be sufficient – that reduced rcan-1 expression plus some residual expression (possibly tissue-specific) is sufficient. To test this, a strain could be generated with the two altered 5’ regions R1 and R2 driving the rcan-1 ORF (as single-copy transgenes to control expression levels) in the background of the rcan-1 null mutant. Would this strain fully recapitulate the phenotypes of the more complex rearrangement lines?

Other simple experiments could further probe the relationship between rcan-1 dosage and exploration/fitness and strengthen the conclusions. The rearrangement has about 25% normal total expression of rcan-1, though this may be higher in some cells and lower in others. What is the phenotype of a rcan-1(lf)/+ heterozygote that would be predicted to have 50% rcan-1 levels in all cells? What is the phenotype of the rearrangement when heterozygous? Is the rearrangement rescued (i.e. fitness and exploration decreased) by injection of WT rcan-1(+)?

3. The NILs are shown to be sufficient for increased exploration and fitness (Fig. 5). But do they fully account for the increased fitness of the original RILhf strain? To test this, a good experiment would be to compete the NILs with RILhf.

4. On several occasions, it is suggested that variation on chromosome V contributes to the exploration phenotype of RILhf (e.g. lines 190, 261). The LSJ2/RILhf RILs clearly show an effect of chromosome V (LSJ2 alleles on this chromosome are associated with increased exploration), so variation on this chromosome can contribute to exploratory behavior. However, it seems unlikely that such variation is contributing to the phenotype of RILhf, since it has no LSJ2 alleles on chromosome V (perhaps RILhf would explore even more with LSJ2 alleles on chromosome V, but the assay used here is already maxed out so wouldn’t be able to resolve this). More likely, LSJ2 alleles on chromosome V increase exploration independently of the rest of the RILhf genetic background, a possibility that could be easily tested if desired. However, this is really a fairly minor point and tangential to the paper. We suggest just making it clear in the writing that chromosome V variation may affect exploration, but not necessarily the exploration phenotype of RILhf.

5. Some simple investigations into the modified regulatory sequence would be interesting. Are there known transcription factor binding sequences or conserved sequences in the 5’ noncoding region that are lost in the truncations?

6. A deeper discussion of worm exploration behavior could enrich this manuscript. What is known about the circuitry controlling this behavior? Is rcan-1 expressed in specific relevant neurons (and expression lost in the R1 and R2 constructs)? Additionally, it is ultimately unclear how exploration relates to fitness and whether this correlation is causal or coincidental. Throughout the paper, we are led to believe that the altered exploration could lead to increased fitness by changing how worms feed (the exploration behavior is sometimes called a “foraging strategy” but perhaps more care should be taken with these terms). But late in the discussion (line 405), we learn that there is no difference in food consumption in strains carrying the rcan-1 rearrangement. An interpretation of this surprising result seems warranted. Does this mean that exploration is unrelated to food consumption (and fitness) or might there be other ways that exploration is related to fitness? The authors don’t need to have the answers or take a side, but it would be nice to have some conclusion since it was just confusing to provide this result with no further comment. (Perhaps not calling exploration a foraging strategy would be advisable, especially when increased exploration is considered “increased foraging activity” as in line 363).

7. Is the known functional connection of rcan-1 to tax-6/calcineurin important? If rcan-1 inhibition of calcineurin is relevant to its exploration and fitness phenotypes, then it might be expected that there is increased calcineurin activity in strains with the rearrangement, and a tax-6 mutation would suppress the increased exploration and fitness phenotypes of the rearrangement.

8. There’s a fair bit of discussion of the rcan-1 rearrangement being formed as an adaptation in response to selective pressures (lines 35-38, 368-385, 393-395). Though the rearrangement clearly increases fitness, it does not seem that it was selected for increased fitness, but rather was likely just formed and fixed accidentally. In fact, the way the RILs were made, there did not seem to be any obvious selection for fitness since about half of the lines have low fitness like their parent LSJ2 (Fig 1D). We recommend a more conservative discussion of these points.

Minor points:

1. The authors are to be commended for using box plots rather than bar graphs in the figures (and for providing the raw data in a supplemental file), but the figure legends should give n values for quick reference.

2. It is unclear why the authors measured animal length (Figure 5D). This is the only time we hear of this phenotype in the paper and it is unclear if it is considered to be an important contributor to increased fitness. As presented, the experiment doesn’t really seem to fit in the paper, but is just a random piece of data thrown in.

3. For Fig. 6A and S4, it would be nice to know which other genes have significantly altered expression (the green and red dots), and whether the same genes were affected in both NILs. This information is not easily derived from the raw data in Table S3.

4. The way the Fig 6A legend is written, it is unclear whether this figure shows combined data from the two NILs (“differences between the NIL and N2* strains”). Given that the figure itself is labeled NIL1, we presume that Fig 6A shows the data from NIL1 and Fig S4 shows the data from NIL2. Please make this clear.

5. The Fig 6A legend should say what the green and red dots mean. The Fig 6C&D legend should say what the arrowheads indicate. It would be best if all the images in Figs 6C and 6D were shown with the same anterior-posterior orientation, with this orientation mentioned in the legend. It seems that the zoomed-in images in 6D show different animals than in 6C, but this should be made explicit.

6. In Figure 4A, the schematic for the gene position is difficult to interpret. It would be good to color code rcan-1 differently from pst-2 so it can be seen where these genes start and end, and show the direction of transcription of each gene (as in Fig 4C). In fact, we recommend using the same schematic in 4C and 4A so that they can be directly compared, especially for the 5’ noncoding region. A scale bar for Fig 4A would also help so we can more easily see how much of the 5’ region has increased coverage.

7. In Figure 6B, the significance bars for statistical comparisons are difficult to interpret, especially the top two levels of these comparisons. It is unclear which data sets are indicated as significantly different from each other. For instance, what do the two *** values at the top refer to? The head of WT compared to head of R1 and R2, the body of WT compared to the body of R1 and R2, etc? Or head, body, and total of WT compared as a combined group to head, body, and total of R1 and R2?

8. It would be interesting to see if there are differences in movement speed between the N2*, RILhf, and rcan-1 NILs. This may be linked to exploratory behavior.

9. The description of the structural changes in the rcan-1 rearrangement is useful but could be condensed. The paragraph at the bottom of page 9/top of page 10 repeats many of the same things said in the opening paragraph of page 9 where the rearrangement is first described.

10. There should be an explanation for what the different colors mean in Figures S1 and S3.

11. The acronym “CNV” should be defined when first used in the abstract.

12. In Table S3, rcan-1 is written as rcn-1. Please correct this to facilitate searching for the data.

13. Line 93 should read: “in wild strains of C. elegans” instead of “in wild strains C. elegans.”

14. There is inconsistency on whether the rearrangement carries five or six tandem inversions (e.g. lines 25 and 213).

15. Line 366. The rearrangement is described as “similar” to a previously published rearrangement that causes another phenotype in C. elegans. However, it’s not clear what is meant by “similar” and this other rearrangement is in fact quite different. It consists of several duplications and a triplication, but no inversions, and causes a phenotype through increased gene dosage rather than decreased expression as reported here. Additionally, the other rearrangement seems to be formed by chromoanasynthesis in which the rearrangement is made by templated synthesis and the breakpoint junctions have short microhomologies. This seems quite different from the rearrangement described here, which does not seem to have any microhomologies at the breakpoints (though this could be explicitly mentioned). The only similarity seems to be that complex rearrangements can cause phenotypes, but we would argue that the rearrangements themselves are not similar at all.

16. Lines 383-385: the point of this sentence is unclear.

17. The paragraph in lines 407-412 seems overly speculative.

18. References 56 and 58 are the same (and can be updated now that the paper is published).

Reviewed (and signed) by Michael Ailion and Lews Caro

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Michael Ailion and Lews Caro

Attachment

Submitted filename: Review.docx

Decision Letter 1

Harmit S Malik, Kirsten Bomblies

24 Dec 2019

Dear Dr McGrath,

Thank you very much for submitting your Research Article entitled 'A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of C. elegans' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. You will see that all three reviewers are happy with your revisions. However, I hope you will take to heart their comments about the writing in some of the 'new' parts. I agree with Reviewer 3 that some of the changes you have made are not as nicely written as the original pieces. This is your opportunity to smooth out those rough edges before your article is published, and I hope you will take it. Essentially, this is just that: an opportunity to edit slightly; otherwise, your article is 'accepted' in principle.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Harmit S. Malik

Associate Editor

PLOS Genetics

Kirsten Bomblies

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have identified an interesting laboratory-derived strain (RILhf) with a spontaneous mutation that causes increased fitness and roaming behavior. My primary concern with the manuscript was that the claim that the rcan-1 rearrangement is causal for the increased fitness and roaming behavior needed to be strengthened. The authors have attempted to address this concern, and the revised Table S3 is excellent and addresses my questions about the backcrossed NILs.

However, it is disappointing that rescue of RILhf with wildtype rcan-1 did not work, with respect to foraging behavior (Figure S8). In the authors’ “response to reviewers” this experiment is interpreted clearly, but I have some concerns with the authors’ interpretation in the paper (lines 300-302). My concerns are as follows:

“potentially this indicates that the rearranged region causes ectopic expression necessary for the changes in foraging behavior…” – this is inconsistent with the rcan-1 CRISPR KO allele promoting foraging behavior.

“…or overexpression of rcan-1 decreases foraging behavior” – this is the opposite of what the data is showing, so perhaps a typo? Do the authors mean overexpression of rcan-1 might increase foraging behavior? Otherwise you would have expected the construct to rescue, since RILhf displays increased foraging behavior.

I think it is more likely that this particular transgenic experiment didn’t work, either for technical reasons, or because additional variations in RILhf promote foraging behavior independent of the rcan-1 rearrangement. Nevertheless, the manuscript deserves to be published in PLOS Genetics, as the rcan-1 rearrangement is a very complex mutation, and fully dissecting the genetics of the various rcan-1 duplications is outside the scope of this paper. Additionally, the findings are interesting and novel, and the experiments are performed carefully and not over-interpreted.

Typo in line 94: “in certain context”

Typo in lines 238-239: should delete “enhancers and other r promoter regions are probably present in the truncated promoters”

Reviewer #2: The authors have addressed all of my concerns.

Reviewer #3: I support publication of this paper, which has been significantly strengthened in this revision. The authors have done a very good job of responding to and addressing the majority of the critiques. Though I do not have major concerns, a few of my original points have not been fully addressed and I repeat them here. Also, many of the parts of the manuscript that were changed upon revision are not written with the same quality as the rest of the manuscript. I point out a number of clear grammar mistakes and typos, but suggest the authors look again at the writing in all their new or edited sections and work more carefully on these parts.

Points:

1. The authors have done a better job of toning down their conclusions about tissue-specific effects of the altered rcan-1 promoter regions in some places and pointing out some of the caveats of extrachromosomal arrays, but there are other places where they still explicitly state that there are tissue-specific effects: Fig 6 title (line 456), line 116, lines 277-279. I am fine with the authors suggesting there might be tissue-specific effects, but think they should not flat out state it as if it were demonstrably proven because the evidence is quite weak.

2. I strongly recommend that the authors not use the term “foraging behavior” for the exploration phenotype and not equate increased exploration with “increased foraging activity.” It isn’t exactly clear what the increased exploration of rcan-1 variants indicates and whether it is causally related to the increased fitness, but it does not seem clearly related to foraging for the rcan-1 variants as their increased exploration does not lead to increased food consumption. Thus, it would be more conservative to just call it “exploration” as that is what is being assayed. The word foraging means “searching for or obtaining food.” The evidence given is that rcan-1 does not affect “obtaining food” and I don’t think it is addressed whether the exploration of rcan-1 variants indicates “searching for food.” An alternative unexplored possibility is that the rcan-1 variants may simply have increased locomotory activity or decreased reversals. Though increased exploration may be a foraging strategy for other mutants, the evidence in this paper does not suggest it is for the rcan-1 variants. Thus, continued use of the term foraging to describe this behavior is both potentially confusing and misleading. Examples include: the short title of the paper, line 51, lines 168-172, line 254, lines 293-308, lines 315-316, lines 349-353, and several Figure legends, but there may be others.

3. Related to point 2, I think the food consumption experiment would fit better in the Results, in the same section with the experiment using uniform bacterial lawns, as a general consideration of the question of how exploration relates to foraging.

4. Though the authors now state that the rcan-1 rearrangement was unlikely to have been selected for its higher fitness, the sentences in lines 38-41 and 346-348 still make it sound as though this rearrangement formed as an adaptation in response to selective pressures. I would tone these down.

5. I previously suggested that the paragraph in lines 371-376 is overly speculative and the authors responded by saying that they have toned this paragraph down, yet it reappears exactly the same as in the original.

6. I did not realize that body size is considered a fitness proximal trait, and am surprised that a smaller size would be associated with increased fitness (if anything, I would have thought the opposite). Thus, the experiment on body length could be better explained and motivated for naïve readers like me. For me, it still seems to come out of the blue, and the immediate segue from short body to increased fitness (line 230) is not intuitive at all.

7. In Figure 5D (body length), the y-axis is unclear. It shows “normalized body length” but the units are unclear and it is unclear what it is being normalized to – the legend says “controls” but it is unclear what these controls are and there are no apparent data for any control strains in the Table S1 source data file.

8. Figure 6C&D: please indicate anterior-posterior direction of these worms.

9. The rcan-1 rearrangement suggests a simple and interesting model for the evolution of gene duplicates that wasn’t clearly presented in the discussion of this phenomenon (lines 321-338). How do gene duplicates evolve from redundancy to become paralogs? The rcan-1 rearrangement suggests a model that by imperfectly duplicating or rearranging the regulatory regions of the duplicated genes, two new genes can be created in a single step that aren’t functionally redundant because they may have differential expression levels and differential tissue-specific expression. Thus, this mechanism might create paralogs instantaneously without having to first evolve through redundancy.

Typos/grammar:

line 49: understanding

line 167: causal

line 230: these data (should be plural)

line 237: should be “upstream sequence”

lines 237-242: run-on sentence (also kind of unclear, I have a feeling it may have changed in ways unintended)

line 238: r at end of line

line 281: fluorescence

line 307: solely

lines 317-318: I think the cited paper has just one rearrangement, so shouldn’t be plural

line 319: unclear antecedent of “this.” Just say: “The rcan-1 rearrangement”

line 335: grammar

line 337: RILhf

line 342: “single single-nucleotide” is awkward

line 378: described

line 378: should be “often” instead of “common”

line 466: compared

lines 467-468, 470-471: grammar problems. Also, say what the white arrows in the R2 panels are showing.

lines 460 & 1045: Table S5

lines 1053 & 1076: grammar problems

Table S3 title: says Table S2 by mistake

Figure S4: rearranged

Figure S7: “deletion” is written on top of “region”

Reviewed (and signed) by Michael Ailion.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Michael Ailion

Decision Letter 2

Harmit S Malik, Kirsten Bomblies

11 Jan 2020

Dear Dr McGrath,

We are pleased to inform you that your manuscript entitled "A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of C. elegans" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Harmit S. Malik

Associate Editor

PLOS Genetics

Kirsten Bomblies

Section Editor: Evolution

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-19-01356R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Harmit S Malik, Kirsten Bomblies

12 Feb 2020

PGENETICS-D-19-01356R2

A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of C. elegans

Dear Dr McGrath,

We are pleased to inform you that your manuscript entitled "A spontaneous complex structural variant in rcan-1 increases exploratory behavior and laboratory fitness of C. elegans" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Kaitlin Butler

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Illumina reads mapped to the rcan-1 locus.

    (A) IGV plot of Illumina short sequencing reads aligned to the rcan-1 genomic locus. (B) Chimeric reads aligned to the rcan-1 genomic locus. Reads are from the resequencing of the N2* (CX12311), LSJ2, and RILhf (CX12348) strains. Besides an increase in coverage at the rcan-1 locus, a large number of chimeric reads (i.e., reads that partially map to two locations) were found in the RILhf strain. (Grey reads are normal reads (pair orientation: LR); cyan reads imply an inversion (pair orientation: LL); blue reads imply an inversion (pair orientation: RR); green reads imply a duplication or translocation (pair orientation: RL). Red reads have larger than expected inferred sizes.)

    (TIF)
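
    The color key in the S1 Fig legend can be read as a small lookup from read-pair orientation to the structural event it suggests. The sketch below is only an illustration of that key: the function name, the `max_expected_insert` parameter, and the choice to apply the insert-size check only to LR pairs are assumptions, not part of the authors' pipeline.

```python
# Hypothetical lookup mirroring the color key in the S1 Fig legend.
PAIR_ORIENTATION_KEY = {
    "LR": ("grey", "normal pair"),
    "LL": ("cyan", "inversion"),
    "RR": ("blue", "inversion"),
    "RL": ("green", "duplication or translocation"),
}


def interpret_pair(orientation, insert_size=None, max_expected_insert=None):
    """Return (color, interpretation) for one read pair.

    Assumption: a pair is flagged red only when it has the normal LR
    orientation and its inferred size exceeds a user-supplied threshold.
    """
    if orientation not in PAIR_ORIENTATION_KEY:
        raise ValueError(f"unknown pair orientation: {orientation!r}")
    if (
        orientation == "LR"
        and insert_size is not None
        and max_expected_insert is not None
        and insert_size > max_expected_insert
    ):
        return ("red", "larger than expected inferred size")
    return PAIR_ORIENTATION_KEY[orientation]
```

    For example, `interpret_pair("RL")` returns the green duplication-or-translocation entry, the class of read pairs one would expect to be enriched at a tandem duplication.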

    S2 Fig. Dot plot of the nanopore sequencing reads aligned to the proposed rcan-1 rearrangement.

    Ten nanopore sequencing reads that overlapped the rcan-1 structural variant were used to generate a dot plot against the proposed rcan-1 rearrangement.

    (TIF)

    S3 Fig. Illumina short sequencing reads aligned to the proposed rcan-1 structural variant.

    Top: All reads aligned to the rcan-1 rearrangement. Bottom: Chimeric reads aligned to the rcan-1 rearrangement. The uniform coverage and lack of chimeric reads are consistent with the proposed structure of the rearrangement. (Grey reads are normal reads (pair orientation: LR); cyan reads imply an inversion (pair orientation: LL); red reads have larger than expected inferred sizes. Unfilled reads have low mapping quality.)

    (TIF)

    S4 Fig. The PCR products include the rearranged regions.

    Red arrows mark the PCR products that include the rearranged regions. Primer details and the expected and observed (agarose gel) lengths of each PCR product are listed in S2 Table.

    (TIF)

    S5 Fig. Transcription factor binding regions at the rcan-1 5’-UTR.

    The green bars represent the transcription factor binding regions. The red bars represent the two truncated promoter regions that drive the full-length rcan-1 gene body in the complex rearrangement. The blue bar represents the highly occupied target (‘HOT’) region. The figure was generated with WormBase JBrowse by adding the transcription factor binding regions feature. Information on the transcription factors is listed in S4 Table.

    (TIF)

    S6 Fig. Volcano plot of rcan-1 NIL2 gene expression vs. N2*.

    Red dots indicate genes with increased expression in rcan-1 NIL2 vs. N2* (p<0.01, log2(Fold Change) > 1). Cyan dots indicate genes with decreased expression in rcan-1 NIL2 vs. N2* (p<0.01, log2(Fold Change) < -1). The list of significantly differentially expressed genes is available in S5 Table.

    (TIF)
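
    The thresholds stated in the S6 Fig legend (p<0.01 and |log2(Fold Change)| > 1) amount to a simple three-way classification of each gene. A minimal sketch follows; the function and parameter names are hypothetical, and the sketch does not assume anything about how the p-values were computed or whether they were adjusted.

```python
def classify_gene(log2_fold_change, p_value, p_cutoff=0.01, lfc_cutoff=1.0):
    """Classify one gene using the cutoffs quoted in the S6 Fig legend.

    Returns "increased" (red dot), "decreased" (cyan dot),
    or "not significant" for every other gene.
    """
    if p_value < p_cutoff and log2_fold_change > lfc_cutoff:
        return "increased"
    if p_value < p_cutoff and log2_fold_change < -lfc_cutoff:
        return "decreased"
    return "not significant"
```

    Both inequalities are strict, matching the legend, so a gene with log2(Fold Change) of exactly 1 is not colored.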

    S7 Fig. Strategy for creating a knockout allele of rcan-1 using CRISPR/Cas9.

    Positions of the two pairs of sgRNAs that target the 5’ and 3’ ends of the rcan-1 coding region. The resulting deletion allele is shown as a blue box.

    (TIF)

    S8 Fig. Exploration fraction of rcan-1 rescue lines.

    The RILhf animals were co-injected with 50 ng/µL Prcan-1(4.5 kb)::rcan-1 PCR product, 5 ng/µL pCFJ90, and 45 ng/µL pSM. The exploration fraction of the animals that expressed mCherry was measured.

    (TIF)

    S9 Fig. Food consumption assay of RILhf and rcan-1 NILs.

    Relative food consumption of indicated strains. Each dot indicates one experimental replicate.

    (TIF)

    S1 Data. rcan-1_NanoporeReads.txt.

    This file contains the sequences of the Oxford Nanopore reads (Fig 4B and S2 Fig) that overlap the structural variant, in FASTA format.

    (TXT)

    S2 Data. rcan-1_RearrangementSequence.txt.

    This file contains the sequence of the proposed rcan-1 structural variant in FASTA format.

    (TXT)

    S3 Data. rcan-1_RearrangementSequence.txt.

    This file contains annotated gene and junction information for the structural variant in GenBank format.

    (TXT)

    S4 Data. rcan-1_RearrangementSequence.dna.

    This file contains annotated gene and junction information for the structural variant in SnapGene format. It also contains the primer information used to study the structural variant. The file can be viewed with SnapGene or with SnapGene Viewer, which is free software.

    (DNA)

    S1 Table. Raw data.

    This table includes the raw experimental data for Figs 1-7, S8 Fig, and S9 Fig.

    (XLSX)

    S2 Table. Rearranged junction sequences.

    This table includes the junction sequences for the rcan-1 structural variant. Primer information and the size of each PCR product are also included.

    (XLSX)

    S3 Table. NIL resequencing.

    This table includes all genetic variants identified in the rcan-1 near isogenic lines (NILs).

    (XLSX)

    S4 Table. TF binding regions in the 5’ UTR.

    This table summarizes the transcription factor binding information for the rcan-1 5’ upstream region, obtained from WormBase.

    (XLSX)

    S5 Table. NIL_RNA-Seq.

    This table includes all gene expression data for rcan-1 NILs.

    (XLSX)

    S6 Table. Sequence information of TaqMan probes and summary of resources and reagents.

    This table lists sequence information for the TaqMan fluorescent quenching probes used for competition experiments. It also lists the key resources and reagents used in this study.

    (XLSX)

    Attachment

    Submitted filename: Review.docx

    Attachment

    Submitted filename: ResponseToReviews.pdf

    Attachment

    Submitted filename: ResponseToReviews.docx

    Data Availability Statement

    All RNA-seq and resequencing files are available from the SRA database NIH BioProject PRJNA526525.


    Articles from PLoS Genetics are provided here courtesy of PLOS