Abstract
The considerable genome size variation in Arabidopsis thaliana has been shown largely to be due to copy number variation (CNV) in 45S ribosomal RNA (rRNA) genes. Surprisingly, attempts to map this variation by means of genome-wide association studies (GWAS) failed to identify either of the two likely sources, namely the nucleolus organizer regions (NORs). Instead, GWAS implicated a trans-acting locus, as if rRNA gene CNV was a phenotype rather than a genotype. To explain these results, we investigated the inheritance and stability of rRNA gene copy number using the variety of genetic resources available in A. thaliana — F2 crosses, recombinant inbred lines, the multiparent advanced-generation inter-cross population, and mutation accumulation lines. Our results clearly show that rRNA gene CNV can be mapped to the NORs themselves, with both loci contributing equally to the variation. However, NOR size is unstably inherited, and dramatic copy number changes are visible already within tens of generations, which explains why it is not possible to map the NORs using GWAS. We did not find any evidence of trans-acting loci in crosses, which is also expected since changes due to such loci would take very many generations to manifest themselves. rRNA gene copy number is thus an interesting example of “missing heritability”—a trait that is heritable in pedigrees, but not in the general population.
Keywords: ribosomes, 45S rRNA genes, natural variation, Arabidopsis thaliana
In eukaryotic genomes, 45S rRNA genes are arranged in clusters termed nucleolus organizer regions (NORs) (Long and Dawid 1980). After transcription by RNA polymerase I, the primary transcript is processed into 18S, 5.8S, and 25S rRNAs that, together with the 5S rRNA (encoded by a separate multi-copy gene), constitute the catalytic core of ribosomes (Chambon 1975; Long and Dawid 1980). In Arabidopsis thaliana, each 45S ribosomal RNA (rRNA) gene is over 10 kb long, and the genome contains hundreds of tandemly arrayed gene copies at the top of chromosomes 2 (NOR2) and 4 (NOR4) (Copenhaver et al. 1995; Copenhaver and Pikaard 1996a). Natural inbred lines (accessions) vary by well over 10% in genome size (Schmuths et al. 2004; Long et al. 2013), largely due to differences in 45S rRNA gene copy number (Davison et al. 2007; Long et al. 2013). However, besides pulsed-field electrophoresis studies in the accession Landsberg indicating that both NORs are similar in size, each spanning ∼3.5–4.0 Mb (Copenhaver and Pikaard 1996b), nothing is known about the specific contribution of each locus to the overall copy number variation (CNV) in 45S rRNA genes.
We previously carried out a genome-wide association study (GWAS) to investigate the genetics of both the variation in genome size and 45S rRNA gene CNV in a population of A. thaliana lines from Sweden. We expected to find significant associations in cis – due to strong linkage disequilibrium between NOR haplotypes and closely linked single nucleotide polymorphisms (SNPs). Surprisingly, the scans identified neither of the two NORs. Instead, the analyses found an association in trans on chromosome 1, as if rRNA gene copy number were a phenotype rather than a genotype (Long et al. 2013).
Alternatively, repeat number may change too rapidly to be mapped using GWAS, but may still be inherited stably enough to be mapped in crosses (Long et al. 2013). Consistent with this, quantitative trait locus (QTL) analyses aimed at understanding the genetics behind NOR methylation in A. thaliana have suggested that CNV at the NORs themselves accounts for some of the methylation variation (Riddle and Richards 2002, 2005). Indeed, rapid changes in 45S rRNA gene copy number have been detected for several species. Examples range from a ∼twofold variation in copy number after 400 generations in fruit fly lines and nematodes (Averbeck and Eickbush 2005; Bik et al. 2013) or a similar 2.5-fold variation after only 70 generations in maize lines (Phillips 1978), to differences >fourfold after 90 generations in water flea lines (McTaggart et al. 2007) or even greater than twofold changes across siblings in humans (Gibbons et al. 2015) or sevenfold changes among individual siblings of a self-pollinated faba bean parent (Rogers and Bendich 1987). In light of the various degrees of instability in rRNA gene copy number displayed by higher plants (Walbot and Cullis 1985), it is relevant to investigate how rapidly the number of rRNA genes changes in A. thaliana.
Our aim in this study was threefold: first, to test if the trans association detected by GWAS (Long et al. 2013) has an effect in a segregating F2 population; second, to confirm that CNV in rRNA genes can be mapped to the NORs themselves in crosses; third, to investigate how copy number in rRNA genes of A. thaliana changes on a generational time scale.
Materials and Methods
DNA extraction and library preparation
We harvested leaves from ∼3-wk-old plants grown under long day conditions (16 hr light and 8 hr at 10°). We extracted DNA in 96-well plates with the NucleoMag 96 Plant (Macherey-Nagel) kit according to the manufacturer’s instructions.
We prepared libraries using a slightly modified version of the Illumina Genomic DNA Sample preparation protocol. Briefly, 100–200 ng of DNA was fragmented by sonication with Bioruptor (Diagenode). End-repair of sheared DNA fragments, A-tailing, and adapter ligation were done with Spark DNA Sample Prep Kit (Enzymatics). NEXTflex-96 DNA Barcodes (Bioo Scientific) were used to attach indexes to the sample insert during adapter ligation. Size selection, with median insert size ∼400 bp, and library purification were performed with Agencourt AMPure XP Beads (Beckman Coulter). Paired-end (PE) DNA libraries were amplified by PCR for 10–12 cycles. After PCR enrichment, libraries were validated with Fragment Analyzer Automated CE System (Advanced Analytical) and pooled in equimolar concentration for 96X-multiplex. Libraries were sequenced on Illumina HiSeq 2000 Analyzers using manufacturer’s standard cluster generation and sequencing protocols in 100 bp PE mode at the Vienna Biocenter Core Facilities next generation sequencing (NGS) unit in Vienna, Austria (http://www.vbcf.ac.at).
Genotyping by sequencing
For each segregating F2 or recombinant inbred line (RIL) population analyzed in this study (1002 × 6244, 6106 × 6071, 8426 × 6193, 6911 × 7213) we applied the following pipeline separately. We extracted both known indels and biallelic homozygous SNPs of the parental accessions from the 1001 Genomes Consortium (1001 Genomes Consortium 2016) with SelectVariants from Genome Analysis Toolkit (GATK; v3.5) (DePristo et al. 2011; Van der Auwera et al. 2013). We combined only segregating SNPs between parental accessions in a single variant call format (VCF) file with GATK/CombineVariants for later genotyping of individual samples (see below).
For each low-coverage sample we mapped PE reads to the A. thaliana TAIR10 reference genome with BWA-MEM (v0.7.4) (Li and Durbin 2009; Li 2013). We used Samtools (v0.1.18) to convert file formats, sort and index bam files (Li et al. 2009), while to remove duplicated reads we used Markduplicates from Picard (v1.101) (http://broadinstitute.github.io/picard/). We performed local realignment around indels by providing to the GATK/RealignerTargetCreator function known indels from the parental accessions to generate the set of intervals required by the GATK/IndelRealigner function. We called SNPs at the segregating sites determined in the combined VCF of the parental accessions with GATK/UnifiedGenotyper in genotyping mode with parameters “-glm SNP -gt_mode GENOTYPE_GIVEN_ALLELES -stand_call_conf 0.0 -G none -out_mode EMIT_ALL_SITES.”
For the construction of individual genetic maps we binned marker SNPs in 100 kb windows using R software with help of the package R/xts (Ryan and Ulrich 2014; R Core Team 2014). We discarded windows with either <100 segregating SNPs or <40 called SNPs: the former for considering them regions of low diversity between parental accessions, while the latter for considering them regions not well supported by reads. We assigned genotype “A” or genotype “B” to windows with >90% of SNP calls for the maternal or paternal accessions, respectively. We determined as genotype “H” windows with either >25% heterozygous calls or where the absolute difference between maternal and paternal SNP calls were <30%.
Estimating rRNA gene copy number through NGS
For each individual, we mapped all reads separately to a single reference 45S rRNA gene (extracted from GenBank: CP002686.1 coordinates 14195483-14204860; Supplemental Material, File S1) and to the A. thaliana TAIR10 reference genome as described in the section Genotyping by sequencing. For our reference 45S rRNA gene (File S1), we based the annotations of the 18S, 5.8S, and 25S subunits (coordinates 2195–4002, 4271–4434, and 4623–8009, respectively) on previous reports (Gruendler et al. 1989; Unfried et al. 1989; Unfried and Gruendler 1990; Cokus et al. 2008). We retrieved per-base read depth with the function Depthofcoverage from GATK (Van der Auwera et al. 2013) before and after removal of duplicated reads. Since the correlation between NGS and qPCR estimates of 45S rRNA gene copy number has been shown to be better before removal of duplicated reads (Long et al. 2013), we performed further quantitative analysis with NGS estimates accordingly.
Since estimates of the 18S and 25S subunits of the 45S rRNA gene are in good agreement (Davison et al. 2007), we estimated 45S rRNA gene copy number in F2s, RILs, multiparent advanced-generation inter-cross (MAGIC) lines, and mutation accumulation (MA) lines through NGS by dividing the average coverage along the 18S rRNA gene by the average coverage along the first 10 Mb of chromosome 3 (File S2). We have chosen that region of chromosome 3 for not containing centromeres, 5S or 45S rRNA genes that due to natural variation in their copy number among accessions (Davison et al. 2007; Long et al. 2013) could affect our sequencing depth estimates.
In the RIL population (Cvi-0 × Ler-0), Cvi-0 was the donor female of RILs CVL1-CVL147, while Ler-0 was the donor female of RILs CVL148-CVL193 (Alonso-Blanco et al. 1998; Lewis et al. 2004). The direction of the cross had no significant effect on rRNA gene copy number (results not shown). Since the 134 individuals of the RIL population were sequenced in separate Illumina lanes, we fitted a simple linear regression model on 18 technical replicates to account for the plate effect and obtain a single rRNA gene copy number estimate per line.
Estimating rRNA gene copy number through qPCR
We estimated 45S rRNA gene copy number in the MA lines through quantitative PCR (qPCR) by comparing the abundance of the 18S rRNA subunit with the single copy gene At3g18780 (ACT2) according to:
where Ct() stands for the threshold cycle for .
For the 18S rRNA gene we used primers 5′-CCT GCG GCT TAA TTT GAC TC-3′ and 5′-GAC AAA TCG CTC CAC CAA CT-3′, and for ACT2, primers 5′-TGC CAA TCT ACG AGG GTT TC-3′ and 5′-TTA CAA TTT CCC GCT CTG CT-3′ (Davison et al. 2007). We employed the FastStart Essential DNA Green Master kit (Roche) according to the manufacturer’s instructions in a LightCycler 96 (Roche) with the following thermal profile: preincubation at 95° for 600 sec; 45 cycles at 95° for 10 sec, 60° for 15 sec (in acquisition mode), and 72° for 15 sec; melting step at 95° for 10 sec, 65° for 60 sec, and 97° for 1 sec. No primer dimers were detected in the melting curve.
Four to five biological replicates of MA lines 29, 39, 49, 59, 69, 79, 89, 99, 109, and 119 (Shaw et al. 2000; Ossowski et al. 2010) were propagated one generation by single-seed descent. We carried out qPCR of each line in four technical replicates (both for the 18S rRNA gene and ACT2) per plate. With the help of a Bravo Automated Liquid Handling Platform from Agilent Technologies, we distributed all lines in 14 96-well plates with some lines present in more than one plate. Ct Errors of technical replicates were on average 0.038 (range 0.01–0.14). We included a common DNA control (accession id: 1002) for all plates for the purpose of standardization. Raw 18S rRNA gene copy number estimates and standardized values are provided in File S2. For the purpose of visualization we plotted 18S rRNA gene abundance relative to the lowest line mean value in generation 32 (line 69).
Linkage mapping
Simple interval mapping (SIM) was performed with the R package R/qtl (Broman et al. 2003). Multiple QTL mapping (MQM) was done with a 2 cM step size and 10 as window size (Arends et al. 2010). A thousand permutations were applied to estimate genome-wide significance. QTL mapping in MAGIC lines and multiple imputation to determine estimated founder accession effects were performed with R/happy (Mott et al. 2000; Kover et al. 2009).
Cleaved amplified polymorphic sequence (CAPS) analysis
CAPS analysis of RILs derived from the cross Cvi-0 × Ler-0 was performed as described elsewhere (Lewis et al. 2004). Briefly, DNA from each RIL was amplified by PCR in a 30 µl reaction with primers 5′-AGG GGG GTG GGT GTT GAG GGA-3′ and 5′-ATC TCG GTA TTT CGT GCG CAA GAC G-3′, and the following thermal profile: 32 cycles at 95° for 20 sec, 62° for 20 sec, and 72° for 40 sec. Resulting PCR products were incubated with restriction enzyme RsaI (New England Biolabs Inc.) for 4 hr at 37° and subjected to agarose gel electrophoresis. Cleaved PCR products correspond to Cvi-0 derived rRNA genes, while intact PCR products correspond to Ler-0 derived rRNA genes. Results are summarized in File S2.
Fluorescence in situ hybridization (FISH)
The preparation of root-tip meristem chromosome spreads followed the protocol published by Mandáková and Lysak (2016). Seedlings were germinated on filter paper soaked in distilled water in a Petri dish at 21°. Cut, ∼1 cm long, roots were pretreated with ice-cold water for ∼24 hr, then fixed in ethanol:acetic acid (3:1) fixative at 4° for 24 hr. The fixed roots were rinsed in distilled water and 1× citrate buffer (10 mM sodium citrate, pH 4.8), and digested by 0.3% pectolytic enzymes (cellulase, cytohelicase, and pectolyase) in 1× citrate buffer at 37° for 90 min. Individual root-tip meristematic tissues were dissected in ∼20 µl of 60% acetic acid on a clean microscopic slide. Then the cell material was covered with a coverslip, evenly spread by tapping, and the slide gently heated over a flame. The slide was frozen in liquid nitrogen, the coverslip flicked off, fixed in ethanol:acetic acid (3:1) fixative, and air-dried. The suitable slides selected after inspection under a phase-contrast microscope were processed as described by Lysak and Mandáková (2013). In brief, the slides were pretreated with ribonuclease A (100 μg/ml in distilled water) at 37° for 1 hr and with pepsin (0.1 mg/ml in 10 mM HCl) at 37° for 1–3 min, and postfixed in 4% formaldehyde in 2× SSC (20× SSC: 3 M NaCl in 0.3 M sodium citrate, pH 7.0) at room temperature for 10 min. The slides were washed in 2× SSC between the steps and eventually dehydrated in an ethanol series (70, 80, and 96% ethanol, 3 min each).
A. thaliana BAC clone T15P10 containing 45S rRNA genes was used to identify the NORs. To identify A. thaliana chromosomes 2 and 4, 11 BAC clones from the upper arm of chromosome 2 (F2I9, T8O11, T23O15, F14H20, F5O4, T8K22, F3C11, F16J10, T3P4, T6P5, and T25N22) and 15 BACs from the upper arm of chromosome 4 (F6N15, F5I10, T18A10, F3D13, T15B16, T10M13, T14P8, T5J8, F4C21, F9H3, T27D20, T19B17, T26N6, T19J18, and T1J1) were used. The 45S rRNA gene probe was labeled with Cy3-dUTP, chromosome 2 BACs with biotin-dUTP, and chromosome 4 BAC clones with digoxigenin-dUTP by nick translation (Lysak and Mandáková 2013). Samples of 100 ng from each labeled BAC DNA were pooled together, ethanol precipitated, dissolved in 20 µl of 50% formamide in 10% dextran sulfate in 2× SSC, and pipetted onto the selected microscopic slides. The slides were heated to 80° for 2 min and incubated at 37° overnight. Hybridized DNA probes were visualized either as the direct fluorescence of Cy3-dUTP (yellow) or through fluorescently labeled antibodies against biotin-dUTP (red) and digoxigenin-dUTP (green). DNA labeling and fluorescence signal detection were carried out using a previously published protocol (Lysak and Mandáková 2013). Chromosomes and nuclei were counterstained with 4,6-diamidino-2-phenylindole (DAPI, 2 µg/ml) in Vectashield antifade. Fluorescence signals were analyzed and photographed using a Zeiss Axioimager epifluorescence microscope and a CoolCube camera (MetaSystems), and pseudocolored/inverted using Adobe Photoshop CS5 software (Adobe Systems). The size of fluorescence signals corresponding to the 45S rRNA gene probe was measured in Photoshop as the number of pixels per a defined area.
Data availability
DNA sequencing data from F2s and RIL populations have been deposited at the U.S. National Center for Biotechnology information (https://www.ncbi.nlm.nih.gov/bioproject) under BioProject: PRJNA326502. DNA sequencing data from the MAGIC lines and MA lines were downloaded from the European Nucleotide Archive (http://www.ebi.ac.uk/ena) under accession numbers PRJEB19252 (Imprialou et al. 2017) and PRJEB5287 (Hagmann et al. 2015), respectively.
Results
45S rRNA gene CNV can be mapped to specific NORs in F2s
To better understand the genetics of 45S rRNA gene CNV, we generated an F2 population from a cross between a large copy number accession from northern Sweden—TRÄ-01 (6244), with ∼2500 units per haploid genome—and a small copy number accession from southern Sweden – Ale-Stenar-64-24 (1002), with ∼500 units. We used NGS to phenotype (we estimated the copy number of the 18S rRNA gene, which is strongly correlated with the copy number of the full gene) and genotype the population simultaneously (Figure 1A; see Materials and Methods). In sharp contrast to GWAS, linkage mapping identified the distal end region at the top of chromosome 2 as the sole source of variation in rRNA gene copy number in this population (Figure 1B). The trans-association identified by GWAS in chromosome 1 (Long et al. 2013) was not captured by this analysis, despite the fact that the alleles responsible for the presumed association segregate in the parental accessions.
To corroborate that NOR2 is indeed responsible for the difference in rRNA gene copy number, we performed FISH in both parental accessions. The results showed that NOR2 and NOR4 in the southern accession Ale-Stenar-64-24 are of similar size to each other—NOR4 is on average 1.49× larger (106.5/71.17 pixels; n = 29) than NOR2 in mitotic chromosomes—while in the northern accession TRÄ-01, NOR2 is 2.39× larger than NOR4 (299.64/125.27 pixels; n = 26) (Figure 1C).
Mapping in two further F2 populations showed that it is not always NOR2 varying in size. CNV mapped to NOR2 in the cross Ull1-1 (8426) × TDr-7 (6193) (Figure S1, A and B in File S3), but to NOR4 in the cross T460 (6106) × Omn-5 (6071) (Figure S1, C and D in File S3). In neither population was there evidence of any trans-acting loci. Taken together, these results show that both NORs vary in size, and that this size is stable enough to be readily traced over two generations. Given this stability, it is not unexpected that we saw no evidence of trans-acting loci, because such loci would by necessity modify the copy number.
Size heterogeneity of rRNA gene loci in a worldwide population
Two of our F2 populations identified NOR2 as the major source of CNV; one identified NOR4. To improve our understanding of CNV in the general population, beyond a few biparental crosses, we employed the MAGIC population that is derived from intercrossing 19 worldwide accessions (Kover et al. 2009). Mapping of 18S rRNA gene copy number in 393 individuals of the MAGIC population revealed that both NORs contribute to the variation to a similar extent (Figure 2A), with the contribution varying greatly among founder lines (Figure 2B). For example, on average, MAGIC lines carrying NOR2 from accessions Bur-0 (7058) and Zu-0 (7417) have fewer copies than do lines that carry NOR4 from these lines instead, because—as confirmed by FISH—founder accessions Bur-0 (Figure 2C) and Zu-0 (Figure 2D) have larger NOR4 than NOR2. Remarkably, we were unable to detect any fluorescence corresponding to 45S rRNA genes in chromosome 2 of Bur-0, suggesting that NOR2 is almost absent in this line (Figure 2C).
Unstable inheritance of rRNA gene copy number in a RIL population
While rRNA gene copy number appeared relatively stable in F2 progeny (Figure 1 and Figure S1, A–D in File S3), we thought it might be possible to observe changes in RIL populations, which have typically undergone at least eight generations of inbreeding since the original cross. Mapping in a RIL population derived from a cross between Cvi-0 and Ler-0 (Alonso-Blanco et al. 1998)—two accessions that differ by as few as ∼100 rRNA gene copies (Riddle and Richards 2002) (Figure S1E in File S3)—showed that rRNA gene CNV maps to NOR2 (Figure S1F in File S3). However, after splitting the estimates of rRNA gene copy number by parental origin for each NOR, aberrant values became apparent (Figure 3A). Most notably, CVL45 carried ∼200 rRNA gene copies less than other individuals with Cvi-only NORs, while CVL168 and CVL102 have ∼150 and ∼250 fewer copies, respectively, than other individuals carrying Ler-only NORs (Figure 3A).
To rule out that these drastic changes were due to interchromosomal exchange (recombination) between homologous NORs of different parental origin, we performed CAPS analysis that discriminates between rRNA genes of the parental accessions Cvi-0 and Ler-0 (Lewis et al. 2004) (Figure 3B). This analysis revealed that the low copy number phenotypes of CVL102 and CVL168 cannot be the product of recombination with Cvi-0 NORs, since no traces of Cvi-like NORs were identified. Similarly, CVL45 contains exclusively Cvi-0 NORs (Figure 3B). Copy number must thus have mutated in these lines, perhaps via unequal crossing-over. Indeed, our observations are consistent with numerous studies suggesting that unequal crossing-over is the prevalent mechanism in the evolution and dynamics of rRNA genes (Eickbush and Eickbush 2007); with sister chromatid exchange being more frequent than exchange between homologs in budding yeast (Petes 1980; Szostak and Wu 1980), fruit flies (Williams et al. 1989; Schlötterer and Tautz 1994), and humans (Seperack et al. 1988). Worth noticing is that the distribution of rRNA gene copy number in this RIL population, which has undergone at least nine generations of inbreeding, shows an apparent lack of F1-like phenotypes (Figure S1E in File S3), further supporting the notion that NORs in homologous chromosomes do not readily recombine in A. thaliana (Copenhaver et al. 1995). This is in apparent contrast to humans, where presumably meiotic recombination accounts for the striking variability observed at single NORs in parent–child trios (Schmickel et al. 1985; Kuick et al. 1996; Stults et al. 2008).
Changes in rRNA gene copy number may be associated with changes in heterochromatin formation (Paredes and Maggert 2009). Relative to Ler-0, Cvi-0 has reduced chromatin compaction, and QTL mapping (using the same RIL population used here) pointed to PHYTOCHROME-B (PHYB) and HISTONE DEACETYLASE 6 (HDA6) as regulators of light-mediated chromatin compaction (Tessadori et al. 2009). Furthermore, the decreased levels of DNA and histone H3K9 methylation at the NORs resembled those seen in the hda6 mutant in the Col-0 background (Riddle and Richards 2002; Earley et al. 2006, 2010; Tessadori et al. 2009). Although our mapping did not identify significant trans-acting QTL for rRNA gene CNV in this RIL population (Figure S1F in File S3), we tested the effect of NOR-of-origin as a function of the allele (Ler-0 or Cvi-0) inherited at either PHYB or HDA6 directly (using a linear model). This analysis revealed no significant contribution of PHYB (Figure S2A in File S3), and only a marginally significant interaction for the role of HDA6 at NOR genotypes Ler-Cvi and Ler-Ler – p-value = 0.0298 and p-value = 0.0125, respectively (Figure S2B in File S3).
Unstable inheritance of rRNA gene copy number in MA lines
We next turned to MA lines: independent descendants of the reference accession Col-0 that have been maintained by single-seed descent for over 30 generations in the absence of selection (Shaw et al. 2000). Note that since these are inbred lines, changes in copy number due to recombination between copy number variants can definitely be ruled out. We quantified 18S rRNA gene copy number by qPCR for two consecutive generations in 10 lines that have diverged for 31 generations (Ossowski et al. 2010; Schmitz et al. 2011; Becker et al. 2011) (Figure 3C). We considered a full linear mixed-effects model in which “line” and “generation” were added as fixed effects, while “replicates” per line across generations were added as random effects. We used likelihood ratio tests to compare the full model and two reduced models: (1) omitting “line” – the effect of 30 generations since divergence – or (2) “generation” – the effect of one subsequent propagation by single-seed descent. While “line” significantly affected rRNA copy number (2 (1) = 298.19, p-value < 2.2e−16), “generation” had a negligible impact (2 (2) = 3.6, p-value = 0.057). In other words, the difference among independent MA lines accumulated in the 31 generations since divergence is much greater than the one manifested in only one generation – or the intrinsic error of our measurement. That these estimates are reliable is also evidenced by the good correlation between qPCR and NGS estimates for generation 31 (R-squared = 0.88, p-value = 6.105e−05) (Figure S3 in File S3) (Becker et al. 2011; Hagmann et al. 2015). There is thus clear evidence for instability of rRNA gene copy number over as few as 30 generations.
Discussion
This study was motivated by our observation that rRNA gene copy number, the major determinant of genome size variation in A. thaliana, behaved very strangely in GWAS (Long et al. 2013). Specifically, although the variation was likely to be due to CNV at the NORs, we were not able to map them in cis. Instead, we mapped what appeared to be a trans-acting locus, which prompted us to consider rRNA gene CNV as a phenotype rather than a genotype, at least in part (Long et al. 2013). To help make sense of these findings, we decided to study the pattern of inheritance using F2s and inbred lines. As opposed to the case in humans (Schmickel et al. 1985; Kuick et al. 1996; Stults et al. 2008), we found that rRNA gene copy number clearly behaves like a genetic trait in pedigrees, with the trait mapping either to NOR2 or NOR4 depending on the parents (Figure 1, Figure 2, and Figure S1 in File S3). However, we also found that the trait is unstably inherited: by amassing estimates of rRNA gene copy number from F2s, RILs, and MA lines in sets of individuals sharing the same genotypes at both NORs, we were able to show that progressive copy number changes are evident already in tens of generations (Figure 4). Together, these two observations provide an explanation for why we were not able to map the NORs using GWAS: copy number is simply too unstable, and hence not heritable over the time scales relevant in GWAS. This is thus a bona fide case of “missing heritability”—a trait that is heritable in families, but cannot possibly be mapped using GWAS (Manolio et al. 2009).
We did not find any evidence for trans-acting loci affecting rRNA gene copy number in any of the artificial mapping populations used in this study. Since we now know that the trait behaves like a genotype rather than a phenotype on this time scale, this is not surprising. It also does not imply that the reported association (Long et al. 2013) is a false positive, because a trans-acting locus that works by biasing the mutation process, predisposing carriers to acquire more or less copies, would not have any effect over a few generations. Such a locus may still affect genome size in local populations of A. thaliana, and be mappable using GWAS (this would thus be exactly the opposite of missing heritability—a phenotype that is only heritable on a population scale but cannot be observed in pedigrees). Resolving this through crosses may be difficult in a plant with a relatively long life cycle.
Does all this variation have any biological relevance? It has been recently shown that in A. thaliana Col-0, only NOR4 derived rRNA genes are actively transcribed and associated with the nucleolus, while NOR2 is silent (Pontvianne et al. 2010, 2013; Chandrasekhara et al. 2016). However, our cytological analysis showed that in TRÄ-01, NOR2 is the NOR associated with the nucleolus, indicating that in this accession, NOR2 rRNA genes might be the active ones (Figure 1C and Table S1 in File S3). Furthermore, using transcriptome analysis (F. A. Rabanal and M. Nordborg, unpublished data), we identified great variation among accessions in which NOR is utilized, and demonstrated that a complex dominance hierarchy appears to exist among NOR haplotypes. Thus, not only do both clusters contribute to genome size variation (Figure 1, Figure 2, and Figure S1 in File S3), but they also contribute to rRNA expression in natural populations. A large megabase-scale deletion at an rRNA gene cluster in the allotetraploid plant Tragopogon mirus led to a breakdown of nucleolar dominance patterns (Dobešová et al. 2015), and rRNA gene copy number shifts can even affect genome-wide euchromatic expression patterns in flies (Lemos et al. 2008; Paredes et al. 2011). Whether this is the case in A. thaliana remains unexplored.
In conclusion, we have shown that rRNA gene copy number is semi-conservatively inherited and starts to diverge over a time scale of tens of generations. As a result, the trait is heritable in pedigrees, but cannot be mapped using GWAS. This resolves the seemingly paradoxical GWAS results for rRNA gene CNV in A. thaliana, and lays the ground for trying to understand whether any of the observed variation has functional importance, as suggested by its geographic distribution (Long et al. 2013).
Supplementary Material
Supplemental material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.117.040204/-/DC1.
Acknowledgments
We thank Ashley Farlow, Daniele Filiault, and Ortrun Mittelsten Scheid for helpful discussions during the course of this project, Youssef Belkhadir for his valuable input in the preparation of this manuscript, and Claude Becker for providing seeds of the MA lines. This work was partly funded by a grant from the European Research Council (MAXMAP, grant No. 268962) and a grant from the Czech Science Foundation (grant No. P501/12/G090).
Author contributions: F.A.R. and M.N. conceived and designed the experiments. F.A.R., V.N., and T.M. performed the experiments. P.Y.N., M.A.L., and R.M. contributed reagents/materials/analysis tools. F.A.R. and M.N. wrote the manuscript.
Footnotes
Communicating editor: D. J. de Koning
Literature Cited
- 1001 Genomes Consortium , 2016. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell 166: 481–491. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Alonso-Blanco C., Peeters A. J., Koornneef M., Lister C., Dean C., et al. , 1998. Development of an AFLP based linkage map of Ler, Col and Cvi Arabidopsis thaliana ecotypes and construction of a Ler/Cvi recombinant inbred line population. Plant J. 14: 259–271. [DOI] [PubMed] [Google Scholar]
- Arends D., Prins P., Jansen R. C., Broman K. W., 2010. R/qtl: high-throughput multiple QTL mapping. Bioinformatics 26: 2990–2992. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Averbeck K. T., Eickbush T. H., 2005. Monitoring the mode and tempo of concerted evolution in the Drosophila melanogaster rDNA locus. Genetics 171: 1837–1846. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Becker C., Hagmann J., Müller J., Koenig D., Stegle O., et al. , 2011. Spontaneous epigenetic variation in the Arabidopsis thaliana methylome. Nature 480: 245–249. [DOI] [PubMed] [Google Scholar]
- Bik H. M., Fournier D., Sung W., Bergeron R. D., Thomas W. K., 2013. Intra-genomic variation in the ribosomal repeats of nematodes. PLoS One 8: e78230. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Broman K. W., Wu H., Sen S., Churchill G. A., 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890. [DOI] [PubMed] [Google Scholar]
- Chambon P., 1975. Eukaryotic nuclear RNA polymerases. Annu. Rev. Biochem. 44: 613–638. [DOI] [PubMed] [Google Scholar]
- Chandrasekhara C., Mohannath G., Blevins T., Pontvianne F., Pikaard C. S., 2016. Chromosome-specific NOR inactivation explains selective rRNA gene silencing and dosage control in Arabidopsis. Genes Dev. 30: 177–190. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cokus S. J., Feng S., Zhang X., Chen Z., Merriman B., et al. , 2008. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature 452: 215–219. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Copenhaver G. P., Pikaard C. S., 1996a RFLP and physical mapping with an rDNA-specific endonuclease reveals that nucleolus organizer regions of Arabidopsis thaliana adjoin the telomeres on chromosomes 2 and 4. Plant J. 9: 259–272. [DOI] [PubMed] [Google Scholar]
- Copenhaver G. P., Pikaard C. S., 1996b Two-dimensional RFLP analyses reveal megabase-sized clusters of rRNA gene variants in Arabidopsis thaliana, suggesting local spreading of variants as the mode for gene homogenization during concerted evolution. Plant J. 9: 273–282. [DOI] [PubMed] [Google Scholar]
- Copenhaver G. P., Doelling J. H., Gens S., Pikaard C. S., 1995. Use of RFLPs larger than 100 kbp to map the position and internal organization of the nucleolus organizer region on chromosome 2 in Arabidopsis thaliana. Plant J. 7: 273–286. [DOI] [PubMed] [Google Scholar]
- Davison J., Tyagi A., Comai L., 2007. Large-scale polymorphism of heterochromatic repeats in the DNA of Arabidopsis thaliana. BMC Plant Biol. 7: 44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- DePristo M. A., Banks E., Poplin R., Garimella K. V., Maguire J. R., et al. , 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43: 491–498. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dobešová E., Malinská H., Matyášek R., Leitch A. R., Soltis D. E., et al. , 2015. Silenced rRNA genes are activated and substitute for partially eliminated active homeologs in the recently formed allotetraploid, Tragopogon mirus (Asteraceae). Heredity 114: 356–365. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Earley K., Lawrence R. J., Pontes O., Reuther R., Enciso A. J., et al. , 2006. Erasure of histone acetylation by Arabidopsis HDA6 mediates large-scale gene silencing in nucleolar dominance. Genes Dev. 20: 1283–1293. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Earley K. W., Pontvianne F., Wierzbicki A. T., Blevins T., Tucker S., et al. , 2010. Mechanisms of HDA6-mediated rRNA gene silencing: suppression of intergenic Pol II transcription and differential effects on maintenance vs. siRNA-directed cytosine methylation. Genes Dev. 24: 1119–1132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Eickbush T. H., Eickbush D. G., 2007. Finely orchestrated movements: evolution of the ribosomal RNA genes. Genetics 175: 477–485. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gibbons J. G., Branco A. T., Godinho S. A., Yu S., Lemos B., 2015. Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes. Proc. Natl. Acad. Sci. USA 112: 2485–2490. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gruendler P., Unfried I., Pointner R., Schweizer D., 1989. Nucleotide sequence of the 25S–18S ribosomal gene spacer from Arabidopsis thaliana. Nucleic Acids Res. 17: 6395–6396. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hagmann J., Becker C., Müller J., Stegle O., Meyer R. C., et al. , 2015. Century-scale methylome stability in a recently diverged Arabidopsis thaliana lineage. PLoS Genet. 11: e1004920. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Imprialou, M., A. Kahles, J. B. Steffen, E. J. Osborne, X. Gan et al., 2017 Genomic rearrangements in Arabidopsis considered as quantitative traits. Genetics DOI: 10.1534/genetics.116.192823. [DOI] [PMC free article] [PubMed]
- Kover P. X., Valdar W., Trakalo J., Scarcelli N., Ehrenreich I. M., et al. , 2009. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 5: e1000551. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kuick R., Asakawa J., Neel J. V., Kodaira M., Satoh C., et al. , 1996. Studies of the inheritance of human ribosomal DNA variants detected in two-dimensional separations of genomic restriction fragments. Genetics 144: 307–316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lemos B., Araripe L. O., Hartl D. L., 2008. Polymorphic Y chromosomes harbor cryptic variation with manifold functional consequences. Science 319: 91–93. [DOI] [PubMed] [Google Scholar]
- Lewis M. S., Cheverud J. M., Pikaard C. S., 2004. Evidence for nucleolus organizer regions as the units of regulation in nucleolar dominance in Arabidopsis thaliana interecotype hybrids. Genetics 167: 931–939. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., 2013 Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv Available at: https://arxiv.org/abs/1303.3997.
- Li H., Durbin R., 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25: 1754–1760. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., et al. 1000 Genome Project Data Processing Subgroup , 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Long E. O., Dawid I. B., 1980. Repeated genes in eukaryotes. Annu. Rev. Biochem. 49: 727–764. [DOI] [PubMed] [Google Scholar]
- Long Q., Rabanal F. A., Meng D., Huber C. D., Farlow A., et al. , 2013. Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden. Nat. Genet. 45: 884–890. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lysak M. A., Mandáková T., 2013. Analysis of plant meiotic chromosomes by chromosome painting. Methods Mol. Biol. 990: 13–24. [DOI] [PubMed] [Google Scholar]
- Mandáková T., Lysak M. A., 2016. Chromosome preparation for cytogenetic analyses in Arabidopsis. Curr. Protoc. Plant Biol. 1: 43–51. [DOI] [PubMed] [Google Scholar]
- Manolio T. A., Collins F. S., Cox N. J., Goldstein D. B., Hindorff L. A., et al. , 2009. Finding the missing heritability of complex diseases. Nature 461: 747–753. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McTaggart S. J., Dudycha J. L., Omilian A., Crease T. J., 2007. Rates of recombination in the ribosomal DNA of apomictically propagated Daphnia obtusa lines. Genetics 175: 311–320. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mott R., Talbot C. J., Turri M. G., Collins A. C., Flint J., 2000. A method for fine mapping quantitative trait loci in outbred animal stocks. Proc. Natl. Acad. Sci. USA 97: 12649–12654. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ossowski S., Schneeberger K., Lucas-Lledó J. I., Warthmann N., Clark R. M., et al. , 2010. The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana. Science 327: 92–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Paredes S., Maggert K. A., 2009. Ribosomal DNA contributes to global chromatin regulation. Proc. Natl. Acad. Sci. USA 106: 17829–17834. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Paredes S., Branco A. T., Hartl D. L., Maggert K. A., Lemos B., 2011. Ribosomal DNA deletions modulate genome-wide gene expression: “rDNA-sensitive” genes and natural variation. PLoS Genet. 7: e1001376. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Petes T. D., 1980. Unequal meiotic recombination within tandem arrays of yeast ribosomal DNA genes. Cell 19: 765–774. [DOI] [PubMed] [Google Scholar]
- Phillips R. L., 1978. Molecular cytogenetics of the nucleolus organizer region, pp. 711–741 in Maize Breeding and Genetics, edited by Walden D. B. John Wiley & Sons, New York. [Google Scholar]
- Pontvianne F., Abou-Ellail M., Douet J., Comella P., Matia I., et al. , 2010. Nucleolin is required for DNA methylation state and the expression of rRNA gene variants in Arabidopsis thaliana. PLoS Genet. 6: e1001225. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pontvianne F., Blevins T., Chandrasekhara C., Mozgová I., Hassel C., et al. , 2013. Subnuclear partitioning of rRNA genes between the nucleolus and nucleoplasm reflects alternative epiallelic states. Genes Dev. 27: 1545–1550. [DOI] [PMC free article] [PubMed] [Google Scholar]
- R Core Team , 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. [Google Scholar]
- Riddle N. C., Richards E. J., 2002. The control of natural variation in cytosine methylation in Arabidopsis. Genetics 162: 355–363. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Riddle N. C., Richards E. J., 2005. Genetic variation in epigenetic inheritance of ribosomal RNA gene methylation in Arabidopsis. Plant J. 41: 524–532. [DOI] [PubMed] [Google Scholar]
- Rogers S. O., Bendich A. J., 1987. Heritability and variability in ribosomal RNA genes of Vicia faba. Genetics 117: 285–295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ryan, J. A., and Ulrich J. M., 2014 xts: extensible time series. R package version 0.9–7. Available at: http://CRAN.R-project.org/package=xts.
- Schlötterer C., Tautz D., 1994. Chromosomal homogeneity of Drosophila ribosomal DNA arrays suggests intrachromosomal exchanges drive concerted evolution. Curr. Biol. 4: 777–783. [DOI] [PubMed] [Google Scholar]
- Schmickel R. D., Gonzalez I. L., Erickson J. M., 1985. Nucleolus organizing genes on chromosome 21: recombination and nondisjunction. Ann. N. Y. Acad. Sci. 450: 121–131. [DOI] [PubMed] [Google Scholar]
- Schmitz R. J., Schultz M. D., Lewsey M. G., O’Malley R. C., Urich M. A., et al. , 2011. Transgenerational epigenetic instability is a source of novel methylation variants. Science 334: 369–373. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schmuths H., Meister A., Horres R., Bachmann K., 2004. Genome size variation among accessions of Arabidopsis thaliana. Ann. Bot. 93: 317–321. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Seperack P., Slatkin M., Arnheim N., 1988. Linkage disequilibrium in human ribosomal genes: implications for multigene family evolution. Genetics 119: 943–949. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shaw R. G., Byers D. L., Darmo E., 2000. Spontaneous mutational effects on reproductive traits of Arabidopsis thaliana. Genetics 155: 369–378. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stults D. M., Killen M. W., Pierce H. H., Pierce A. J., 2008. Genomic architecture and inheritance of human ribosomal RNA gene clusters. Genome Res. 18: 13–18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Szostak J. W., Wu R., 1980. Unequal crossing over in the ribosomal DNA of Saccharomyces cerevisiae. Nature 284: 426–430. [DOI] [PubMed] [Google Scholar]
- Tessadori F., van Zanten M., Pavlova P., Clifton R., Pontvianne F., et al. , 2009. PHYTOCHROME B and HISTONE DEACETYLASE 6 control light-induced chromatin compaction in Arabidopsis thaliana. PLoS Genet. 5: e1000638. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Unfried I., Gruendler P., 1990. Nucleotide sequence of the 5.8S and 25S rRNA genes and of the internal transcribed spacers from Arabidopsis thaliana. Nucleic Acids Res. 18: 4011. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Unfried I., Stocker U., Gruendler P., 1989. Nucleotide sequence of the 18S rRNA gene from Arabidopsis thaliana Co10. Nucleic Acids Res. 17: 7513. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Van der Auwera G. A., Carneiro M. O., Hartl C., Poplin R., Del Angel G., et al. , 2013. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43: 11.10.1–11.10.33. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Walbot V., Cullis C. A., 1985. Rapid genomic change in higher plants. Annu. Rev. Plant Physiol. 36: 367–396. [Google Scholar]
- Williams S. M., Kennison J. A., Robbins L. G., Strobeck C., 1989. Reciprocal recombination and the evolution of the ribosomal gene family of Drosophila melanogaster. Genetics 122: 617–624. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
DNA sequencing data from F2s and RIL populations have been deposited at the U.S. National Center for Biotechnology information (https://www.ncbi.nlm.nih.gov/bioproject) under BioProject: PRJNA326502. DNA sequencing data from the MAGIC lines and MA lines were downloaded from the European Nucleotide Archive (http://www.ebi.ac.uk/ena) under accession numbers PRJEB19252 (Imprialou et al. 2017) and PRJEB5287 (Hagmann et al. 2015), respectively.