2014 Aug 5;79(6):981–992. doi: 10.1111/tpj.12600

The low-recombining pericentromeric region of barley restricts gene diversity and evolution but not gene expression

Katie Baker 1, Micha Bayer 2, Nicola Cook 1, Steven Dreißig 3, Taniya Dhillon 1, Joanne Russell 2, Pete E Hedley 2, Jenny Morris 2, Luke Ramsay 2, Isabelle Colas 2, Robbie Waugh 1,2, Brian Steffenson 4, Iain Milne 2, Gordon Stephen 2, David Marshall 2, Andrew J Flavell 1,*
PMCID: PMC4309411  PMID: 24947331

Abstract

The low-recombining pericentromeric region of the barley genome contains roughly a quarter of the genes of the species, embedded in low-recombining DNA that is rich in repeats and repressive chromatin signatures. We have investigated the effects of pericentromeric region residency upon the expression, diversity and evolution of these genes. We observe no significant difference in average transcript level or developmental RNA specificity between the barley pericentromeric region and the rest of the genome. In contrast, all of the evolutionary parameters studied here show evidence of compromised gene evolution in this region. First, genes within the pericentromeric region of wild barley show reduced diversity and significantly weakened purifying selection compared with the rest of the genome. Second, gene duplicates (ohnolog pairs) derived from the cereal whole-genome duplication event ca. 60MYa have been completely eliminated from the barley pericentromeric region. Third, local gene duplication in the pericentromeric region is reduced by 29% relative to the rest of the genome. Thus, the pericentromeric region of barley is a permissive environment for gene expression but has restricted gene evolution in a sizeable fraction of barley's genes.

Keywords: barley, Hordeum vulgare, heterochromatin, genome evolution, pericentromeric

Introduction

Barley was domesticated ca. 8000 years ago from its wild progenitor Hordeum vulgare ssp. spontaneum (hereafter termed H. spontaneum). It is the fourth most important cereal worldwide, after maize, rice and wheat. Barley is an inbreeding diploid species and has become a model for genomic research in other Triticeae crops, including wheat and rye. The sequence of the barley gene space, together with a framework for the genome sequence (International Barley Genome Sequencing Consortium (IBGSC), 2012), comprises 26 159 high-confidence (HC) genes anchored to a 3479-point genetic map (Comadran et al., 2012). The comparative genomics of barley versus small genome grass ‘model’ species has been analysed in depth and the major segmental rearrangements that distinguish the different genomes are known (Salse et al., 2008; Thiel et al., 2009; IBGSC, 2012).

The ancestor of the cereal grasses underwent a whole-genome duplication (WGD) event around 50–70 MYa (Salse et al., 2008). Since then there have been multiple lineage-specific genomic rearrangements (Salse et al., 2008; Thiel et al., 2009) in the evolving cereal lineages. In addition, there has been extensive gene loss, which has been biased in favour of one or other of the progenitor diploid genomes (Schnable et al., 2012).

Many cereal genomes are inflated in size, mainly as a result of the proliferation of transposable element (TE) insertions, most of which are retrotransposons (Paterson et al., 2009; Schnable et al., 2009; IBGSC, 2012). These insertions are more common in the regions surrounding the centromeres, leading to inflation in the pericentromeric (PC) region, which comprises at least 48% of the barley physical genome, containing an estimated 14–22% of the total barley gene content (IBGSC, 2012). Thus, gene number is high in the PC region but gene density is low, with each gene typically surrounded by huge TE arrays. This situation is not confined to cereals; for example, roughly 57% of the soybean genome and ~22% of its genes are PC (Schmutz et al., 2010).

Recombination is reduced in the vicinity of TEs (Fu et al., 2002) and in PC regions it is strongly suppressed (Schnable et al., 2009; Schmutz et al., 2010; IBGSC, 2012; Higgins et al., 2012). In barley, recombination commences at telomeres and progresses internally with crossover interference inhibiting this process in the interior (Higgins et al., 2012). Lack of recombination in the PC regions renders the genes within it relatively inaccessible to breeders, who need to re-assort alleles to achieve crop improvement. Restricted recombination also has potential impact upon gene evolution (Begun and Aquadro, 1992; Hudson, 1994; Nordborg et al., 2005; Wright et al., 2006; Charlesworth et al., 2009). Multi-gene haplotypes in low-recombining (LR) regions are expected to evolve as concerted units with low diversity, which should be further reduced by the preferred selfing habit of most cereal crop species, including barley. Newly arising mutations in genes within LR regions, most of which would be either neutral or weakly deleterious, persist for many generations in close genetic linkage because recombination cannot separate them and selection cannot remove them. This phenomenon is known as Hill-Robertson interference.

The LR-PC region is predominantly heterochromatic, being highly compacted throughout the cell cycle. For sorghum, chromatin compaction and recombination rate are linked (Kim et al., 2005). In Arabidopsis DNA methylation and repressive histone covalent modifications such as H3K9Me2 (Lippman et al., 2004; Hall et al., 2012) correspond closely. For cereals there tends to be higher levels of repressive epigenetic marks in the PC regions but also clear evidence for the presence of such marks across the genome (Houben et al., 2003; Shi and Dawe, 2006; Carchilan et al., 2007; Higgins et al., 2012), consistent with the corresponding genomic distribution of retrotransposon insertions (Schnable et al., 2009; Paterson et al., 2009; IBGSC, 2012). Collectively, these data are consistent with the model that heterochromatin at the local level is defined by TE density (Lippman et al., 2004) and for cereals may potentially be found almost anywhere in the genome.

In animals, the juxtapositioning of heterochromatin near genes can lead to suppressed gene expression (Jost et al., 2012) but, in A. thaliana at least, genes surrounded by heterochromatin are insulated from heterochromatic silencing (Lippman et al., 2004). Total mRNA levels have been reported to be lower in the LR-PC region than for the predominantly high-recombining (HR) chromosome arms in soybean (Du et al., 2012) and maize (Gent et al., 2012). For rice, apparently contradictory results have been seen between chromosomes for averaged mRNA levels for the LR-PC versus HR regions (Yin et al., 2008; Wu et al., 2011). This issue is further complicated by the problem of gene annotation in the LR region, which contains decayed TE remnants that can be mis-annotated as ‘normal’ genes and thus inflate the apparent genic diversity of the PC region and distort other parameters such as gene expression data. In summary, the available evidence suggests that total mRNA levels are low across plant PC regions but this may be mainly due to low gene density, with averaged expression levels per gene not dramatically different from the rest of the genome.

LR-PC region residency also affects gain and loss of genes. Levels of gene tandem duplication in rice and Arabidopsis are correlated with recombination rate (Rizzon et al., 2006), as the former relies upon unequal exchange. Thus, multi-copy gene clusters would be expected to be smaller and/or less frequent in Triticeae LR DNA. Furthermore, genes that have become duplicated following segmental duplication or WGD events tend to be eliminated relatively rapidly in a process termed diploidization (Wolfe, 2001). Loss of WGD-derived gene paralogs (ohnologs [Wolfe, 2000]) has tended to occur asymmetrically among duplicated genome segments (Thomas et al., 2006; Woodhouse et al., 2010; Du et al., 2012), with HR–HR WGD gene duplicates having evolved more rapidly (Du et al., 2012). Retained gene pairs seem to diverge rapidly soon after duplication (Lynch and Conery, 2000) but more slowly than single genes overall (Yang et al., 2003; Yang and Gaut, 2011; Du et al., 2012).

In the current study, we have explored the diversity, evolution, expression and duplication of the genes in the LR-PC region of barley. We have compared these parameters to the rest of the genes of the species to discover how this genomic environment impacts upon them.

Results

Defining the low-recombining pericentromeric region of barley

The LR-PC region of barley can be visualised by plotting genetic map positions of genes or markers against their corresponding physical positions (Figures 1, S1 and Table S1). We define the LR-PC region as the continuous region surrounding the centromere for which recombination rate is 20-fold lower than the average for the barley genome (Choo, 1998). By this definition 6285 of 35 134 mapped barley genes (17.9%) are within the LR-PC region. If substantial LR regions flanking the PC region are included (Figure 1), a further 2400 genes (6.8%) are assigned to LR DNA, adding up to 24.7% of the total barley gene complement.
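The definition above can be illustrated with a short sketch. This is not the authors' procedure: it assumes recombination rates are available as an ordered list of per-gene (or per-bin) values along one chromosome, that the index of the centromere is known, and that "20-fold lower" means a rate at or below one twentieth of the genome average. The function name, input layout and toy values are all hypothetical.

```python
def lr_pc_window(rates, centromere_idx, genome_avg, fold=20.0):
    """Return (start, end), inclusive, of the continuous run around the
    centromere whose recombination rate is <= genome_avg / fold.

    Returns None if the centromere position itself is above the threshold.
    """
    threshold = genome_avg / fold
    if rates[centromere_idx] > threshold:
        return None
    start = end = centromere_idx
    # Extend towards the short-arm end while the rate stays low.
    while start > 0 and rates[start - 1] <= threshold:
        start -= 1
    # Extend towards the long-arm end while the rate stays low.
    while end < len(rates) - 1 and rates[end + 1] <= threshold:
        end += 1
    return start, end


# Toy chromosome: high rates near the ends, near-zero rates in the middle.
toy_rates = [2.0, 1.5, 0.8, 0.04, 0.01, 0.0, 0.02, 0.03, 0.9, 1.8]
window = lr_pc_window(toy_rates, centromere_idx=5, genome_avg=1.0)
print(window)
```

Requiring a continuous run anchored on the centromere mirrors the wording of the definition; isolated low-recombining stretches elsewhere on the chromosome would not be picked up by this sketch.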

Figure 1.


The LR-PC region of barley – definition, diversity and recombination.

(a) Genetic versus physical map locations of genes on barley chromosome 2H. The centromere (dark grey oval) is surrounded by a continuous LR-PCH region (mid-grey bar), with flanking LR regions shown in light grey.

(b) Diversity and recombination statistics for chromosome 2H. Rolling averages for gene nucleotide diversity (π, red) are plotted with recombination rate (cM/gene, green) against gene order. The LR-PCH regions (grey shading) correspond to the regions in (a).

Gene-based diversity in the LR-PC region of wild barley

To investigate gene sequence diversity across the genome of the wild species H. spontaneum, we selected 14 diverse wild barley samples (Figure S2; Experimental procedures). RNA-seq was performed on whole seedlings and sequence reads were mapped onto a consensus set of 22 651 full-length (FL) barley cDNAs (Matsumoto et al., 2011). Unsurprisingly, read-depth varied both within and between the FLcDNAs, since mRNAs are expressed at different levels with multiple splice variants and cDNA synthesis efficiency varies along its template. Fortunately, read depths for most positions of most genes were remarkably similar for different samples, giving good overlap for SNP discovery. Our mapping contained a total of 128 749 SNPs.

To estimate regional changes in gene diversity within the barley genome, nucleotide diversity (π) statistics for the mapped genes corresponding to the FLcDNAs were plotted against corresponding map positions (Figures 1b and S3). There is a marked drop in π in the interior of all seven chromosomes. When recombination rate is plotted on the same graphs it is evident that gene nucleotide diversity for H. spontaneum broadly follows recombination rate (R), with a pronounced drop in the vicinity of the LR-PC regions.
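For readers unfamiliar with the statistic, nucleotide diversity (π) is the average number of pairwise differences per site among a set of aligned sequences. The sketch below is a minimal illustration of that textbook definition on toy data; it ignores missing data, gaps and read-depth filtering, and it is not the software used in the study, which is not named in this section.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Four toy haplotypes of 10 sites with two segregating sites.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTTCGTAA"]
pi = nucleotide_diversity(toy)
print(pi)
```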

mRNA expression levels in the LR-PC region

The LR-PC region has been considered a repressive environment and gene expression has been reported to be reduced in the LR-PC region of maize and soybean (Du et al., 2012; Gent et al., 2012). We therefore explored mRNA expression across the barley genome. Steady-state mRNA levels in 15 tissue types, covering a variety of developmental stages of cv. Morex (Druka et al., 2006), were plotted against corresponding pseudo-physical map positions (IBGSC, 2012). Averaged RNA values across all tissue types are shown for chromosome 1H as an example in Figure 2(a) and detailed expression data are in Figure S4. No difference in RNA level was observed between the genes of the LR-PC region and those from the HR region, either by chromosome or by tissue type (ANOVA, P > 0.05).

Figure 2.


Gene expression level and developmental specificity are independent of LR-PC region residency in barley.

Expression parameters for barley genes (Y axes) are plotted against their linear order (X axes) on barley chromosome 1H. The continuous LR-PCH is shaded grey.

(a) Average RNA levels (arbitrary units), taken across 15 tissue types and developmental stages (Druka et al., 2006).

(b) Developmental and/or tissue specificity quotients (= data from (a) divided by their corresponding standard deviations).

We also searched for possible differences in RNA expression specificity between the LR-PC and HR regions. Fluctuations between RNA levels for each gene among the 15 tissue and developmental stage types, relative to their corresponding average value (i.e. [average expression/standard deviation]; Figure 2b), were plotted. Again, no trend was visible and none was supported by ANOVA (P > 0.05). We conclude that the genes within the LR-PC regions show no significant evidence for differential expression specificity, relative to the rest of the genome.
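The specificity quotient described here (average expression divided by standard deviation) is simple to compute per gene. The sketch below uses made-up expression values across five tissues rather than the 15 used in the study, and it assumes the sample standard deviation; the text does not say which form of the standard deviation was used.

```python
from statistics import mean, stdev


def specificity_quotient(levels):
    """Mean expression across tissues divided by the standard deviation.

    A high value means even expression; a low value means expression is
    concentrated in a few tissues.
    """
    sd = stdev(levels)  # sample SD; an assumption, not stated in the text
    if sd == 0:
        return float("inf")  # perfectly uniform expression
    return mean(levels) / sd


broad = [10.0, 11.0, 9.0, 10.0, 10.0]      # hypothetical housekeeping-like gene
specific = [0.0, 0.0, 50.0, 0.0, 0.0]      # hypothetical tissue-specific gene
q_broad = specificity_quotient(broad)
q_specific = specificity_quotient(specific)
print(q_broad, q_specific)
```

Both toy genes have the same mean level, so the quotient separates them purely on how evenly that expression is spread across tissues.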

Gene evolution in the LR-PC region

Restricted recombination within the LR-PC region is expected to impact upon its gene evolution (see Introduction), leading to increased non-synonymous substitution (πa), relative to the synonymous rate (πs) (Charlesworth et al., 2009). To test this prediction, gene-based πa, πs and πa/πs values were explored in wild barley (H. spontaneum). π statistics were determined for 5475 mapped genes in the 14 wild barley sample RNA-seq data set described above (Figures 3 and S5). Both πa and πs show marked diversity reduction in the LR-PC region and this effect is less pronounced for πa, consistent with the above predictions. πa/πs plots show strong fluctuations, making it difficult to discern LR-PC effects for some chromosomes, but the mean πa/πs value for all LR-PC region genes is 0.235 (SD 0.010) and the same parameter for HR genes is 0.170 (SD 0.011; Table S2). This difference is significant (independent t-test; t = −11.507, df = 12, P < 0.001) and we conclude that the evolution of genes within the LR-PC region of H. spontaneum has been significantly impacted by Hill–Robertson interference.
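As a rough plausibility check on the reported t-test, the statistic can be recomputed from the summary values quoted above (means 0.235 and 0.170, SDs 0.010 and 0.011). The sketch assumes a pooled-variance Student's t-test with seven values per group, which is inferred here from df = 12 and the seven barley chromosomes rather than stated in the text.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# HR group first, LR-PC group second; n = 7 per group is an assumption.
t, df = pooled_t(0.170, 0.011, 7, 0.235, 0.010, 7)
print(round(t, 2), df)
```

Under these assumptions the result is close to the reported t = −11.507; the small gap is what one would expect from working with rounded means and standard deviations.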

Figure 3.


Gene selection is less effective in the LR-PC region than the HR genome compartment of H. spontaneum.

πa, πs and πa/πs values per gene among 14 diverse H. spontaneum lines (Y axes) are plotted against their corresponding linear gene orders (X axes) on barley chromosome 2H. Black lines indicate rolling averages (50 genes) and the LR-PC regions are indicated by shading.

The LR-PC region and the cereal whole-genome duplication

To explore the effect of restricted recombination upon gene and genome evolution over the timescale of the evolution of cereal grasses, we searched for barley ohnolog gene pairs deriving from the cereal WGD that occurred ca. 60 MYa. Each gene in such pairs resides either in the LR-PC or the HR regions and the long-term effects of the two genome compartments upon gene evolution can be compared (see Discussion).

It is difficult to isolate WGD-derived barley ohnolog pairs because the Triticeae lineage has experienced high levels of translocation of genes and pseudogenes (Wicker et al., 2011) and local chromosome rearrangements, particularly segmental inversions (Luo et al., 2009) (Figure S6). We therefore adopted an iterative strategy designed to collect gene pairs showing sequence similarity within the broad range expected for WGD-derived duplication and genomic locations inherited from the WGD (i.e. no gene transposition). First, all barley genes with assigned genomic positions were compared against each other by BLAST search. Second, the output was filtered to remove gene pairs much too similar (i.e. recently duplicated) or diverged (>60 MYa separation) to be WGD-derived. Third, all genes in non-syntenic genomic positions relative to the rice, Brachypodium and sorghum genomes were removed, reasoning that a barley gene with no synteny support from these other cereal genomes is extremely unlikely to show synteny conservation from the WGD. Fourth, remaining barley genes were reordered according to local Brachypodium gene order, i.e. replacing barley genetic map order by Brachypodium physical order within collinear barley-Brachypodium synteny blocks (Figure S7). Finally, residual non-collinear barley genes with anomalous barley genetic map positions off the main orthology trend were removed (Figure S8).

This procedure yielded a final gene list of 12 348 mapped barley–Brachypodium syntenic gene pairs occupying substantially orthologous genomic locations between the two complete genomes. A chromosome versus chromosome X–Y plot of best BLAST hits with this cleaned-up set of barley genes revealed seven major WGD segments reported previously (Salse et al., 2008; Thiel et al., 2009; Figure S9) but the loose structure made it difficult to discriminate genuine ohnolog pairs from chance juxtaposition of transposed paralogs. We therefore used two approaches in parallel to eliminate the latter and our final selection represents a synthesis of both. First, the X–Y plots were manually edited to remove all pairs falling outside dense groupings, yielding 408 ohnolog pairs (Figure S10a). Second, the unedited paralog gene list was inputted into MCScanX (Wang et al., 2012), which finds ohnologous regions in genomes with ancient WGDs, yielding 366 pairs (Figure S10b). Merging these two lists yielded a consolidated candidate set of 498 ohnolog pairs (overlap = 276 gene pairs). Final manual trimming of this set removed 101 pairs with Ks scores >3, 107 pairs with genes involved in >1 pair, pairs in ohnologous regions comprising <9 genes and pairs in regions overlapping stronger regions of paralogy.
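The merge of the two candidate lists is a set union, and the reported counts are internally consistent: 408 + 366 − 276 = 498. The sketch below reproduces that arithmetic with placeholder identifiers; only the set sizes come from the text, and the identifiers are hypothetical.

```python
# Placeholder pair identifiers chosen so that the set sizes match the text.
manual = {f"pair{i}" for i in range(408)}          # manually edited X-Y plots
mcscanx = {f"pair{i}" for i in range(132, 498)}    # MCScanX output

shared = manual & mcscanx    # pairs found by both approaches
merged = manual | mcscanx    # consolidated candidate set

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert len(merged) == len(manual) + len(mcscanx) - len(shared)
print(len(manual), len(mcscanx), len(shared), len(merged))
```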

Our final list comprises 290 HC WGD ohnolog pairs (580 genes) (Table S3, Figure S10c). 281 of these pairs are distributed among the seven previously defined WGD descendant chromosome segments (Salse et al., 2008; Thiel et al., 2009) and the remaining nine derive from the diploid cereal ancestral A3/A7 chromosome duplication (= barley 2H/5H).

Properties of WGD-derived ohnolog pairs

If the LR-PC region has affected the properties of genes contained within it then there should be corresponding differences between the three possible classes of ohnolog pairs, namely LR–LR, LR–HR and HR–HR. The properties of the WGD ohnolog pairs retained by barley are summarized in Table 1.

Table 1.

Distribution of barley ohnologs and ohnolog pairs by genome compartment and chromosome

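| Type | Genome compartment | Number, Obs (a) | Number, Exp (a) | 1H | 2H | 3H | 4H | 5H | 6H | 7H | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ohnolog | HR | 477 | 437 | 124 | 70 | 97 | 13 | 28 | 75 | 70 | 477 |
| Ohnolog | LR | 103 | 143 | 1 | 8 | 17 | 8 | 0 | 52 | 17 | 103 |
| Ohnolog pair | HR–HR | 187 | 164 | 107 | 38 | 97 | 12 | 3 | 75 | 42 | 374 |
| Ohnolog pair | LR–HR (LR) | 103 | 107 | 1 | 8 | 17 | 8 | 0 | 52 | 17 | 103 |
| Ohnolog pair | LR–HR (HR) | 103 | 107 | 17 | 32 | 0 | 1 | 25 | 0 | 28 | 103 |
| Ohnolog pair | LR–LR | 0 | 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Columns 1H to 7H give the chromosomal distribution.

(a) Exp, Expected; Obs, Observed (see text).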

Of the 580 ohnologs, 103 (18%) reside in the LR-PC region, together with flanking LR regions, compared to an overall gene content for this compartment of 25%. Thus, ohnologs have been preferentially lost from the LR-PC region (χ2 = 16.2, df = 1, P = 0.00006). When the distribution of the 290 ohnolog pairs among the three combinations of genome compartment (LR–LR, LR–HR, HR–HR) is examined, the main source of the loss becomes apparent. One hundred and eighty-seven ohnolog pairs are HR–HR (expected 164), 103 pairs are LR–HR (expected 107) and surprisingly, none are LR–LR (expected 18). This biased distribution is also highly significant (χ2 = 21.9, df = 2, P < 0.0001) and the main source of this is the LR–LR category. We conclude from these results that LR–LR WGD-gene pairs have been strongly selected against in the barley lineage and HR–HR and LR–HR classes show no significant evidence for this.
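The two chi-squared values quoted above can be reproduced from the observed counts. The sketch below is one way to arrive at them, not necessarily the authors' calculation: it assumes the rounded 25% figure for the LR gene fraction and assumes that the two members of an ohnolog pair fall into compartments independently, so that expected pair frequencies are p², 2pq and q².

```python
def chi_squared(observed, expected):
    """Pearson chi-squared statistic for matched observed/expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


P_LR = 0.25        # LR fraction of genes, as rounded in the text
P_HR = 1 - P_LR

# Individual ohnologs: HR versus LR (df = 1).
n_ohnologs = 580
exp_ohnologs = [n_ohnologs * P_HR, n_ohnologs * P_LR]
chi2_ohnologs = chi_squared([477, 103], exp_ohnologs)

# Ohnolog pairs: HR-HR, LR-HR, LR-LR (df = 2), assuming independent placement.
n_pairs = 290
exp_pairs = [
    n_pairs * P_HR ** 2,
    n_pairs * 2 * P_HR * P_LR,
    n_pairs * P_LR ** 2,
]
chi2_pairs = chi_squared([187, 103, 0], exp_pairs)

print(round(chi2_ohnologs, 1), round(chi2_pairs, 1))
```

Under these assumptions the statistics come out at 16.2 and 21.9, matching the values in the text. Most of the second statistic is contributed by the empty LR–LR class, which agrees with the interpretation given above.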

The ohnologs are distributed unevenly across the genome (Table 1 and Figure 4). Chromosomes 4H and 5H carry only 10% of the total ohnologs within 27% of the total mapped gene content of barley. Half of the entire LR-PC region ohnolog complement derives from chromosome 6H alone and a further 33% is found on chromosomes 3H and 7H together, leaving just 17% between the other four chromosomes. These biases derive mainly from the fact that the large majority of ohnolog pairs belong to four WGD-derived regions shared by chromosomes 1H and 3H, 2H and 6H and 6H and 7H (Figure 4). Almost all of chromosomes 6H and 7H, together with the long arms of chromosomes 1H, 2H and 3H, retain ohnologs. All the ohnologous regions combined correspond to genomic space containing 46% of mapped barley genes (16 013/35 134), or 51% of the barley physical map (1.98 Gbp of the total 3.90 Gbp). Thus, roughly half of the barley genome lacks detectable duplicated gene pairs derived from the WGD.

Figure 4.


Barley ohnolog pairs and the LR-PC region.

Barley chromosomes 1H–7H are indicated by different coloured bars, LR-PCH regions are indicated by dark shading, flanking LR by light shading and centromeres by dark grey ovals. All features are scaled by pseudo-physical map position (IBGSC, 2012). Ohnolog gene pairs are connected by colour-coded lines to indicate HR–HR pairs (red) and LR–HR pairs (blue). There are no LR–LR pairs.

We also scrutinized the sequence evolution of the 290 ohnolog pairs. Average pairwise nucleotide identity between them is 69.8% and the average Ks (synonymous substitution rate) is 1.290, consistent with a non-coding nucleotide substitution rate of 1.38 × 10^−9 substitutions/site/year, assuming a divergence time of 60 MYa. Ka, Ks and Ka/Ks values are all slightly higher for HR–HR pairs (0.18, 1.30, 0.17) than for LR–HR pairs (0.17, 1.27, 0.15) but the differences are not significant. The frequency distributions for the three evolutionary parameters (Figure S11) show slight skewing of HR–HR pairs towards higher Ka/Ks values. This corresponds to 25 ohnolog pairs with Ka/Ks > 0.3, 21 of which are HR–HR.

We next explored the functional evolution of the barley ohnolog pairs. GO term enrichment analysis of the 290 pairs (Du et al., 2010) showed significant over-representation in several biochemical pathways or functions, particularly intracellular signaling and phosphate modification (Table S4). Thus, there is evidence that barley ohnolog pairs have been preserved because of the functions of their gene products.

Finally, we investigated the gene expression of the ohnolog pairs, using RNA-seq data from the IBGSC dataset (IBGSC, 2012). Analysis of covariance testing revealed no significant difference in differential gene expression between HR–HR and LR–HR pairs (68 and 71% respectively). Similarly, K-means cluster analysis of the same data set showed no evidence for clustering by ohnolog pair type. We conclude that retained ohnolog pairs show no significant effect of the LR-PC region upon their expression divergence. We also looked for bias in ohnolog expression by chromosomal region. Of the 290 ohnolog pairs, 180 show significant bias in averaged RNA level within gene pairs. When these are assigned to ohnolog region (Table S5), no evidence of regional bias in ohnolog expression is evident (χ2 = 2.85; P = 0.83).

The LR-PC region and local gene duplication

Local gene duplication is the result of unequal exchange during recombination, so reduced recombination should lead to reduction in locally duplicated gene families (Zhang and Gaut, 2003; Rizzon et al., 2006). Local duplications appear as points on the diagonal in a second-best BLAST plot of a chromosome to itself (e.g. Figure S6). When our mapped gene dataset was analysed, 7379 locally duplicated genes were identified, or 21% of the total mapped gene complement. Of these locally duplicated genes, 1297 (17.6%) reside in the PC region, compared to 24.7% of all genes (Table 2). Therefore the barley PC region is depleted by roughly 29% for locally duplicated genes, relative to the non-PC region. Both this difference and all corresponding differences for the seven barley chromosomes are significant (Table 2).

Table 2.

Distribution of locally duplicated genes by genome compartment and chromosome

| Chromosome | Locally duplicated genes: HR | LR | Total | % LR | All mapped genes: HR | LR | Total | % LR | Ratio of percentages (a) | P-value (b) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1H | 702 | 150 | 852 | 17.6 | 3405 | 895 | 4300 | 20.8 | 0.85 | 0.0374 |
| 2H | 937 | 200 | 1137 | 17.6 | 4183 | 1398 | 5581 | 25.0 | 0.70 | <0.0001 |
| 3H | 1055 | 229 | 1284 | 17.8 | 4172 | 1384 | 5556 | 24.9 | 0.72 | <0.0001 |
| 4H | 480 | 169 | 649 | 26.0 | 2421 | 1226 | 3647 | 33.6 | 0.77 | 0.0002 |
| 5H | 1120 | 168 | 1288 | 13.0 | 4711 | 1148 | 5859 | 19.6 | 0.67 | <0.0001 |
| 6H | 676 | 217 | 893 | 24.3 | 2820 | 1487 | 4307 | 34.5 | 0.70 | <0.0001 |
| 7H | 1112 | 164 | 1276 | 12.9 | 4733 | 1150 | 5883 | 19.5 | 0.66 | <0.0001 |
| Total | 6082 | 1297 | 7379 | 17.6 | 26445 | 8688 | 35133 | 24.7 | 0.71 | <0.0001 |
| % | 82.4 | 17.6 | 100 | | 75.3 | 24.7 | 100 | | | |

(a) % LR (locally duplicated genes)/% LR (all mapped genes).

(b) Two-tailed P-value for chi-squared test with Yates' correction.
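The "ratio of percentages" column and the headline 29% depletion follow directly from the HR and LR counts in Table 2. The sketch below recomputes them from the tabulated counts; the variable and function names are illustrative, and the P-values are not recomputed because the exact layout of the authors' contingency test is not given here.

```python
# (HR, LR) counts per chromosome, copied from Table 2.
duplicated = {
    "1H": (702, 150), "2H": (937, 200), "3H": (1055, 229), "4H": (480, 169),
    "5H": (1120, 168), "6H": (676, 217), "7H": (1112, 164),
}
all_genes = {
    "1H": (3405, 895), "2H": (4183, 1398), "3H": (4172, 1384), "4H": (2421, 1226),
    "5H": (4711, 1148), "6H": (2820, 1487), "7H": (4733, 1150),
}


def lr_fraction(hr, lr):
    """Fraction of genes that fall in the LR compartment."""
    return lr / (hr + lr)


def ratio_of_percentages(dup, mapped):
    """% LR among locally duplicated genes divided by % LR among all genes."""
    return lr_fraction(*dup) / lr_fraction(*mapped)


ratios = {c: ratio_of_percentages(duplicated[c], all_genes[c]) for c in duplicated}

# Genome-wide totals, summed over the seven chromosomes.
tot_dup = tuple(sum(v[i] for v in duplicated.values()) for i in (0, 1))
tot_all = tuple(sum(v[i] for v in all_genes.values()) for i in (0, 1))
total_ratio = ratio_of_percentages(tot_dup, tot_all)

for chrom, r in ratios.items():
    print(chrom, round(r, 2))
print("Total", round(total_ratio, 2), f"depletion {1 - total_ratio:.0%}")
```

A genome-wide ratio of about 0.71 corresponds to the roughly 29% depletion quoted in the text.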

To investigate the distribution of local gene duplications in the barley genome, we plotted the densities of local gene duplications across all chromosomes (Figure S12). No strong trend was seen, but there is a slight tendency for increased local gene duplication density towards the telomeres. Analysis of gene ontology of local duplications using AgriGO v1.2 (Du et al., 2010) revealed no specific GO terms to be enriched in the local duplication dataset compared to all genes.

Discussion

The LR-PC region of barley is permissive for gene expression

The LR-PC region of barley contains roughly a quarter of the genes for the species and at least 48% of the sequenced barley genome in an environment where genetic recombination is suppressed at least 20-fold relative to the average rate and chromatin remains largely compacted during interphase. The exact DNA sequence of this region remains somewhat unclear at present, because it is extremely difficult to sequence through the high density of nested repeats therein. Nevertheless, it is highly likely that the great majority of genes in the LR-PC region are embedded within extensive TE clusters, which are known to be functionally repressed in plants via the feedback loop of RNA silencing and methylation of DNA and histones (Lippman et al., 2004; Hall et al., 2012). Despite these constraints, our results show that both average mRNA level and developmental transcriptional specificity for the genes of the LR-PC region are indistinguishable from those for the HR gene compartment. The LR-PC region of barley is therefore wholly permissive for gene expression, implying that the genes within it are as accessible to the transcription machinery as are corresponding HR genes.

This conclusion contrasts somewhat with the perceived situation for rice (Yin et al., 2008; Wu et al., 2011) and soybean (Du et al., 2012) but these data are also consistent with reasonably abundant gene expression within the LR-PC region of angiosperm plants. For Arabidopsis, genes within heterochromatin are expressed at comparable levels to their euchromatic counterparts and carry local chromatin signatures such as DNA methylation and histone methylation, which are characteristic of euchromatic genes (Lippman et al., 2004). The barley PC region is far more dense in TEs than Arabidopsis and the barley HR region also carries high TE densities, but this environment seems to have little or no effect upon gene expression, so we expect that local chromatin structure in the two genomic compartments will turn out to be similar.

Restricted recombination of the LR-PC region affects overall nucleotide diversity and selection in wild barley

Our data show that the low recombination rate within the PC region of barley has acted upon the genes within it to constrain gene diversity, in agreement with previous studies (IBGSC, 2012). This phenomenon is widespread in nature and has been ascribed to a combination of selective sweeps via fixation of advantageous allele variants and background selection against deleterious mutations (Hudson, 1994; Wright et al., 2006). It should be noted that selective sweeps are not confined to LR regions; their extent is simply larger there (Begun and Aquadro, 1992). This fits with our data, which show clear trends for both recombination rate and diversity to increase towards the telomeres (Figure S3) of most barley chromosomes.

Recombination allows selection to act upon genes, instead of large genomic regions, and the LR-PC region of barley shows a 20-fold restricted recombination rate relative to the average for the species. Our data show that this reduction is associated with higher πa/πs ratios for LR-PC region genes over their HR counterparts, which is consistent with Hill–Robertson effects. This build-up in poorly selected, protein-altering polymorphism is a genetic burden for the species and raises the question of how such a large fraction of the barley genome has become involved. We suggest that the highly diverse and successful retrotransposon population in this lineage has played a major role in the expansion of its LR-PC region. Barley and the other Triticeae species have much larger genomes than their relatives such as Brachypodium and rice; most of the extra DNA consists of retrotransposons and most of these reside in the LR-PC regions. TEs drive the formation of heterochromatin via the RNAi pathway and recombination is associated with open chromatin (Berchowitz et al., 2009).

The potential impact of genetic bottlenecking and inefficient purifying selection in 25% of the barley gene complement upon crop performance is difficult to assess but may be considerable. Furthermore, the many loci within the barley LR-PC region that are important for crop improvement are trapped in extended haplotypes which are extremely difficult to break down by genetic crossing to achieve crop improvement. One promising solution to these problems is provided by the LR-PC regions of wild barley, which have considerably more diversity, both genic and haplotype, than the cultivated gene pool and should be considered as potential sources of new diversity for crop improvement in barley and the other economically important Triticeae crops.

Ohnolog evolution and the LR-PC region

To explore possible long-term effects of the LR-PC region on gene and genome evolution we have collected ohnolog gene pairs derived from the cereal WGD. Our assignment of 290 ohnolog pairs is likely to be an underestimate for two reasons. First, we have used very stringent criteria for defining ohnolog pairs, because false positives might distort the deductions derived from the small number of surviving gene pairs in barley, whereas underestimation of the numbers would be unlikely to greatly affect the broad conclusions. Second, we have only scrutinized the 60% of the total HC barley gene set that is mapped to date. It is therefore likely that the ohnolog pair number will increase significantly, but we think it very unlikely that it will approach the number for rice, which has had 2246 WGD ohnolog pairs defined (Thiel et al., 2009). Even if the ohnolog pair number for barley doubles, it still only represents a few percent of the gene complement for the species, indicating that the rate of gene synteny loss has been particularly high in the barley lineage compared with rice. It may not be a coincidence that barley has experienced much higher levels of both segmental rearrangement and gene translocations than rice (Salse et al., 2008; Thiel et al., 2009; Wicker et al., 2011).

It is also clear that ohnolog pair loss has been strongly biased by genomic position, with two chromosomes (6H and 7H) containing ohnologs across more or less their entire extent and the rest showing large gaps in ohnolog-containing regions. We estimate that roughly half of the barley genome, comprising entire short arms of chromosomes 1H, 3H, 4H and 5H plus large regions of 2HS, 4HL and 5HL, retains no evidence for the WGD (Figure 4). It will be interesting to compare the gene content of such regions with homeologous regions in other cereals that retain ohnolog pairs, to discover how this happened. It is important to note that loss of ohnolog pairs does not necessarily mean gene loss. Gene movement has been widespread for both barley and wheat (Wicker et al., 2010, 2011).

Local gene duplication in the barley LR region

We see a significant reduction in local duplication of genes in the LR-PC region and flanking LR regions, relative to the rest of the genome. This was expected, since homologous recombination must occur for tandem duplications to arise and a similar effect has been seen for Arabidopsis and rice (Zhang and Gaut, 2003; Rizzon et al., 2006). The LR-PC region is far more extensive for barley than it is for rice and Arabidopsis, because of its larger genome and much greater complement of repetitious DNA. We have therefore been able to map local gene duplication more accurately but we can still only just see this effect (Figure S12). This is consistent with the modest overall reduction of 29% that we see in local gene duplication, and it contrasts strongly with the >95% reduction in recombination rate. Thus, greatly reduced recombination does not mean greatly reduced gene duplication for barley. We conclude that selection acts to buffer this presumptive dramatic difference in the incidence of local gene duplicates across the genome, preserving rare duplicates in the LR regions and/or eliminating disadvantageous duplicates in the HR genome compartment.

Effects of chromosomal environment on divergent gene expression

Following WGD events, gene loss (genome fractionation) is rapid and biased by genomic region. For maize, gene expression is the most important factor for gene retention (Schnable et al., 2011). We find no evidence for genomically-biased fractionation in barley based on gene expression (Table S5). This apparent contradiction may be explained by the fact that the maize WGD event occurred 5–12 MYa, whereas we are looking at ohnolog pairs that have survived ca. 60 MYa of selection; thus any bias may have disappeared over this much longer time interval. Another possibility is that the relatively small number of surviving ohnolog pairs in barley represent rare exceptions to biased fractionation. Nevertheless, soybean underwent a WGD around the same time as maize but shows no significant difference between expression level of LR-PC region genes and their HR ohnologs (Gent et al., 2012), consistent with our observation.

Extinction of ohnolog pairs from the LR-PC region

The complete absence of LR–LR ohnolog pairs for barley was perhaps the biggest surprise from our studies. We therefore looked in published plant genome data for sorghum (Paterson et al., 2009), rice (Rizzon et al., 2006; Thiel et al., 2009), Oryza brachyantha (Chen et al., 2013) and maize (Schnable et al., 2009). All these species appear to share this property. We therefore performed complete ohnolog analysis on the sequenced genomes of Brachypodium and rice, using the same recombination-based criterion for LR regions and the available genetic linkage maps and genome data (Tian et al., 2009; Huo et al., 2011). These species also show no evidence for LR–LR pairs (Tables S6–S8), showing that LR–LR pairs are at least very rare and perhaps absent from cereal genomes.

To our knowledge the only sequenced plant genome reported to contain LR–LR ohnolog pairs is soybean and all of these pairs are located at LR–HR boundaries (Du et al., 2012). These regions may have become LR relatively recently and ohnolog elimination has not yet been completed or they may not be fully within the LR-PC region. We define the LR-PC region as a continuous region with at least 20-fold lower average recombination rate than the genomic average and none of the soybean regions fulfil that criterion, with reduced recombination ratios of between 4-fold and 13-fold (Du et al., 2012).

Why are LR–LR ohnolog pairs so rare? Barley's evolutionary lineage has experienced a high level of ohnolog pair loss and even a slight bias towards elimination of LR–LR ohnolog pairs could lead to their extinction. However, this argument is less persuasive for maize and soybean which retain high proportions of ohnolog pairs from more recent WGDs. We therefore suggest that LR–LR ohnolog pairs in plants are eliminated rapidly because neither copy can escape from the repressive environment for gene evolution in the LR-PC region, thus neofunctionalization is inhibited.

In conclusion, the barley LR-PC region is a permissive environment for the expression of genes but it restricts gene evolution and local duplication. It may be that the extinction of LR–LR WGD ohnolog pairs for barley and other plant genomes is a consequence of these restrictions. It is intriguing that these species thrive despite these restrictions on large fractions of their genes.

Experimental procedures

Plant materials

Fourteen Hordeum spontaneum germplasm samples from the World Barley Diversity Collection (WBDC; Steffenson et al., 2007) were selected to maximise both the diversity of chloroplast haplotypes and global genomic diversity, as judged by principal coordinate analysis (Figure S2) of SNP marker data using 1153 Illumina BOPA1 markers (Close et al., 2009).

Definition of LR regions

Genetic map positions for 35 134 mapped barley genes (26 159 HC and 8975 LC) in the Morex-Barke population (Mayer et al., 2011) were plotted against corresponding physical positions (Figure S1). LR regions were defined as continuous genomic regions longer than 2% of the corresponding physical chromosome length, with at least 20-fold lower recombination rate than the average for the corresponding chromosome (Choo, 1998). For Brachypodium and rice the same criterion for LR regions was applied to the linkage maps of Huo et al. (2011) and Tian et al. (2009), respectively.
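The LR criterion can be sketched as a scan over equal-width physical windows. This is a minimal illustration and not the authors' procedure: the function name `find_lr_regions`, the windowed input and the per-window test (each window at or below 1/20 of the chromosome mean) are assumptions made here for clarity.

```python
from typing import List, Tuple


def find_lr_regions(
    positions_bp: List[int],
    rates: List[float],
    chrom_length_bp: int,
    fold: float = 20.0,
    min_fraction: float = 0.02,
) -> List[Tuple[int, int]]:
    """Return (start_bp, end_bp) of contiguous low-recombination runs.

    positions_bp: window start coordinates, ascending, equal width (assumed).
    rates: recombination rate per window, e.g. cM/Mb (assumed input format).
    """
    if not rates:
        return []
    # With equal-width windows the mean of window rates is the chromosome average.
    threshold = (sum(rates) / len(rates)) / fold
    window = positions_bp[1] - positions_bp[0] if len(positions_bp) > 1 else chrom_length_bp

    regions = []
    run_start = None
    run_end = None
    for pos, rate in zip(positions_bp, rates):
        if rate <= threshold:
            if run_start is None:
                run_start = pos
            run_end = pos + window
        elif run_start is not None:
            regions.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        regions.append((run_start, run_end))

    # Keep only runs longer than 2% of the physical chromosome length.
    return [(s, e) for s, e in regions if (e - s) > min_fraction * chrom_length_bp]
```

A stricter reading of the definition would compare the average rate over the whole candidate region, rather than each window, with the chromosome average; the per-window test is used here only because it is simpler to follow.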

Genomic transcription level analysis

Pseudo-physical map positions for genes on the Affymetrix Barley1 GeneChip were found by BLAST comparison of its array sequence file (http://www.plexdb.org/) with 79 379 HC and non-HC presumptive barley gene sequences (IBGSC, 2012). Map positions and the corresponding transcriptomics data (Druka et al., 2006; Table S9) were plotted in MS Office Excel.

RNA-seq data acquisition and analysis

Barley seeds were germinated on moistened sterile filter paper in Petri plates at room temperature in the dark. Embryonic tissue (coleoptile, mesocotyl and seminal roots) was dissected and flash-frozen 4 days post-germination. Total RNA was extracted from 200 mg tissue using TriReagent (Sigma, http://www.sigmaaldrich.com/sigma-aldrich/home.html), with additional phenol–chloroform purification. RNAs were quality checked using a Bioanalyzer 2100 RNA 6000 Nano kit (Agilent, http://www.genomics.agilent.com). Sequencing was carried out on an Illumina GAII instrument (separate lanes per sample) with TruSeq RNA (Illumina, http://www.illumina.com) library generation and single-end 75-bp reads.

Raw sequence reads were quality trimmed from both ends using a base quality cut-off of 20. Identical duplicate reads were removed to reduce the false positive SNP discovery rates. Reads were mapped to a consensus set of 22 651 FLcDNAs, obtained by consolidating two studies (Sato et al., 2009; Matsumoto et al., 2011). Mapping used the Bowtie tool (Langmead et al., 2009), allowing one mismatch per read, to all possible mapping locations on the FLcDNA reference. Reads were mapped one sample at a time and resulting mappings were merged to produce a consolidated single mapping for all lines. To facilitate direct comparison of transcript abundances between and within samples, RPKM (reads/kilobase of reference/million reads) values were computed for all transcripts from the combined mappings of all lines.
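For reference, the RPKM normalisation mentioned above can be written as a one-line calculation. The function name and argument names below are illustrative assumptions; the formula itself is the standard definition (reads per kilobase of reference per million mapped reads).

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of reference per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Example: 500 reads on a 2 kb transcript in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

Because both transcript length and library size appear in the denominator, values are comparable between transcripts of different length and between samples of different depth.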

Single nucleotide polymorphism (SNP) discovery and validation

SNPs were discovered for each sample using custom-written code implemented as a prototype feature in Tablet software (Milne et al., 2013). The raw variant data were pre-filtered to remove variant locations caused by sequencing errors, using both a minor allele frequency cut-off of 0.1 and a minimum read count of 3. To validate mapping and SNP discovery, genotype calls from 10 samples with 1713 SNPs were compared against known corresponding SNP genotypes from Illumina SNP genotyping of the same lines. Validation rates averaged 98% across all SNPs and lines.
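The pre-filtering step can be illustrated with a short sketch. It is not the Tablet implementation: the function name and input format are hypothetical, and it assumes that the minimum read count of 3 applies to the minor allele and that the minor allele frequency is computed from read counts at the site. The text does not state either point explicitly.

```python
from typing import Dict


def passes_prefilter(
    allele_counts: Dict[str, int],
    min_maf: float = 0.1,
    min_minor_reads: int = 3,
) -> bool:
    """Decide whether a candidate variant site survives pre-filtering.

    allele_counts maps each observed base to its read count at the site.
    Assumption: thresholds are applied to the second most frequent allele.
    """
    counts = sorted(allele_counts.values(), reverse=True)
    if len(counts) < 2:
        return False  # monomorphic site, nothing to call
    total = sum(counts)
    minor = counts[1]
    return minor >= min_minor_reads and (minor / total) >= min_maf
```

Under these assumptions a site with 26 reads of A and 4 of G is kept, whereas a site with 98 reads of A and 2 of G is rejected as a likely sequencing error.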

πas determinations for H. spontaneum genes

πas ratios were derived from SNP data using custom Java code for protein-coding sequence identification and SNP effect prediction (i.e. protein-coding or non-coding and synonymous or non-synonymous change) on each cDNA read set and its corresponding reference sequence. The code used the translation engines supplied with the BioJava application programming interface (Prlic et al., 2012) and is available upon request from the authors. Protein-coding regions were defined as the longest open reading frames (ORFs) downstream of an ATG codon in the cDNA reference. Sequences upstream and downstream of these regions were defined as 5′ and 3′ UTRs respectively. Next, 100 of the putative protein-coding sequences identified above were manually checked using the NCBI ORF Finder tool. Each was subjected to BLASTP analysis and homology across the full sequence to proteins in related species was taken as evidence that the ORF was correctly assigned.
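The ORF definition above can be sketched in a few lines. This is an illustrative Python version and not the authors' Java/BioJava code; it assumes forward-strand cDNA, the standard stop codons, and that an ORF must end at an in-frame stop codon. The function names are hypothetical.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(cdna: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest ATG-initiated ORF, end exclusive.

    The returned span includes the stop codon. Only the three forward
    frames are scanned (assumption: the cDNA is in sense orientation).
    """
    seq = cdna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def split_transcript(cdna: str) -> Optional[Tuple[str, str, str]]:
    """Split a cDNA into (5' UTR, coding sequence, 3' UTR), or None if no ORF."""
    orf = longest_orf(cdna)
    if orf is None:
        return None
    start, end = orf
    return cdna[:start], cdna[start:end], cdna[end:]
```

For example, `split_transcript("CCATGAAACCCGGGTAATTT")` separates a two-base 5′ UTR, a 15-base coding sequence and a three-base 3′ UTR.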

Ohnolog pair acquisition

Barley gene sequences were queried against themselves, using blastn with an initial e-value cut-off of 1. Self-hits were discarded then multiple high scoring pairs were reduced to the single best pair per query gene. The output was trimmed to exclude highly similar gene pairs with bit scores above 8000 and/or 100% nucleotide identity over >200 bp, plus very weakly related pairs with both bit scores <300 and alignment length <500 bp (these parameters were selected after scrutiny of the rice ohnolog pair set). Editing and X–Y plotting of paralog gene pairs used MS Office Excel except where noted. Barley genes in regions of low orthologous Brachypodium gene density (>400 Brachypodium gene separation) were removed with custom Java code, with subsequent manual clean-up in MS Office Excel. Ohnolog pairs were selected by a combination of visual inspection of chromosome-by-chromosome X–Y plots in MS Office Excel (‘handpicking’) and analysis with MCScanX (Wang et al., 2012; see Appendix S1). Shared synteny blocks between barley and Brachypodium (Table S10), obtained by plotting best hits between the two species' genomes (Figure S7), were used to order barley genes by Brachypodium gene order. To calculate Ka and Ks scores, aligned sequence pairs were analysed by yn00 (Yang, 2007) in the PAML package. To eliminate ohnolog pairs with unacceptably low alignment quality, alignments were inspected at both protein and DNA levels using both Geneious v6.1.6 (Drummond et al., 2011) and UniPro UGENE v1.12 (Okonechnikov et al., 2012). Circos 0.64 (Krzywinski et al., 2009) was used to plot the physical locations of the ohnolog pairs. SPSS v21 (IBM Corp., 2012) was used to perform t-tests on the πa, πs and πas data.
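The first filtering steps on the self-BLAST output (discarding self-hits, keeping the single best pair per query, and trimming pairs that are too similar or too weak) can be sketched as follows. The `Hit` record and function names are assumptions for illustration; only the numerical thresholds are taken from the text, and the later synteny-based selection is not covered.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Hit:
    query: str
    subject: str
    bit_score: float
    pct_identity: float
    aln_length: int


def best_non_self_hits(hits: Iterable[Hit]) -> List[Hit]:
    """Discard self-hits and keep the highest-scoring pair per query gene."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.query == h.subject:
            continue
        if h.query not in best or h.bit_score > best[h.query].bit_score:
            best[h.query] = h
    return list(best.values())


def keep_pair(h: Hit) -> bool:
    """Apply the similarity window described in the text."""
    # Too similar: bit score above 8000 and/or 100% identity over >200 bp.
    too_similar = h.bit_score > 8000 or (h.pct_identity == 100.0 and h.aln_length > 200)
    # Too weak: both bit score <300 and alignment length <500 bp.
    too_weak = h.bit_score < 300 and h.aln_length < 500
    return not (too_similar or too_weak)


def candidate_pairs(hits: Iterable[Hit]) -> List[Hit]:
    return [h for h in best_non_self_hits(hits) if keep_pair(h)]
```

The upper bound is intended to remove very recent duplicates and the lower bound to remove weakly related pairs, leaving candidates of intermediate divergence for the synteny analysis.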

The final set of HC ohnolog pairs were assigned corresponding genomic environments (LR or HR), depending upon genetic map location (Table S3). Analysis of gene expression of the ohnologs was performed with 262 pairs with RNA-seq data for eight different tissues (IBGSC, 2012). Ohnologs with AK designations were converted to their MLOC equivalents as RNA-seq data is only available for MLOCs. RNA-seq data given in FPKM values were analysed with a univariate analysis of variance test in SPSS v21. PAST software (Hammer et al., 2001) was also used to perform a K-means clustering analysis, followed by a chi-squared test in MS Office Excel.

Barley local gene duplicate acquisition

All barley genes were blasted against themselves. All second-best hits to genes on the same chromosome as the query were selected (i.e. ignoring gene hits to self). From this dataset of barley best hits, locally duplicated genes were selected by removing hits on the same chromosome that were remote from query genes by >2% of the corresponding chromosome length (these were designated intrachromosomal gene translocations).
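The distance rule for separating local duplicates from intrachromosomal translocations amounts to a single comparison. The sketch below is illustrative only; the function name and labels are hypothetical, and positions and chromosome length are assumed to be in the same units.

```python
def classify_same_chrom_hit(
    query_pos: float,
    hit_pos: float,
    chrom_length: float,
    max_fraction: float = 0.02,
) -> str:
    """Classify a best non-self hit that lies on the query's chromosome.

    Hits further than 2% of the chromosome length from the query are
    treated as intrachromosomal translocations, the rest as local duplicates.
    """
    distance = abs(query_pos - hit_pos)
    if distance > max_fraction * chrom_length:
        return "intrachromosomal_translocation"
    return "local_duplicate"
```

On a 500 Mb chromosome, for instance, the cut-off is 10 Mb: a hit 4 Mb away counts as a local duplicate, whereas one 49 Mb away does not.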

Gene ontology analysis

Putative protein sequences were queried against the NCBI non-redundant protein sequence database, using blastp with default settings. Results were processed in Blast2GO (B2G4Pipe Version 2.5.0) (Conesa et al., 2005). Blast2GO takes blast results and assigns GOSlim terms to query sequences, based on GO terms of hit sequences. In total, 96 315 GOSlim terms were assigned to 22 465 barley genes. AgriGO version 1.2 (Du et al., 2010) was used to separately analyse Gene Ontology (GO) term enrichment for both the ohnolog data set and the tandem gene data set.

Accession numbers

RNA-seq data for the 14 wild barley samples in this article are deposited in the European Nucleotide Archive (http://www.ebi.ac.uk/ena/) under study accession PRJEB4947 and sample accession numbers ERS369216–ERS369229 for barley samples WBDC016, WBDC032, WBDC035, WBDC115, WBDC142, WBDC170, WBDC173, WBDC182, WBDC227, WBDC255, WBDC307, WBDC319, WBDC336 and WBDC344 respectively.

Acknowledgments

We thank Brian Charlesworth for advice on diversity and recombination and Linda Cardle for informatics help. This work was supported by grants BBSRC (ERA-PG) ExBarDiv BB/E024726/1 and BBSRC BB/I1022899/1 ‘The diversity and evolution of the gene component of the barley pericentromeric heterochromatin’. I.C. was supported by European Community Grant FP7 222883.

Conflict of interest

The authors declare no conflicts of interest.

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Figure S1. The low recombining peri-centromeric (LR-PC) region of barley.

Figure S2. Selection of highly diverse H. spontaneum lines from the World Barley Diversity Collection by principal coordinate analysis of high throughput SNP marker data.

Figure S3. Diversity and recombination statistics for barley chromosomes.

Figure S4. Developmental and tissue-specific RNA expression levels are independent of LR-PC region residency.

Figure S5. πa and πs statistics among 14 diverse lines across the H. spontaneum genome.

Figure S6. Identification of WGD-derived paralogous regions in the barley genome by visualization of BLAST data.

Figure S7. Using synteny conservation between barley and Brachypodium to order the barley genome.

Figure S8. Acquisition of synteny-supported barley genes using Brachypodium conserved synteny – Chromosome B3H as an example.

Figure S9. Barley paralogy plots for genes sharing conserved synteny with Brachypodium.

Figure S10. Selection of high-confidence, WGD-derived barley paralog pairs.

Figure S11. Nucleotide substitution data for barley ohnologs.

Figure S12. Local gene duplication densities along barley chromosomes.

tpj0079-0981-sd1.docx (5.9MB, docx)
Table S1. Genetic map locations and mapped gene contents for LR regions of the barley genome.

Table S2. H. spontaneum gene diversity and selection statistics.

Table S3. Barley ohnolog gene pairs.

Table S4. Gene ontology terms enriched in barley ohnologs.

Table S5. Analysis of ohnolog gene expression bias by ohnolog region.

Table S6. Distributions of ohnologs and ohnolog pairs for Brachypodium and rice by genome compartment.

Table S10. Shared synteny blocks between Brachypodium and barley genomes.

tpj0079-0981-sd2.docx (120KB, docx)
Table S7. Brachypodium ohnolog MCScanX data.

tpj0079-0981-sd3.xlsx (386.6KB, xlsx)
Table S8. Rice ohnolog MCScanX data.

tpj0079-0981-sd4.xlsx (921.6KB, xlsx)
Table S9. RNA-seq data, with gene assignations and map positions.

tpj0079-0981-sd5.xlsx (10.6MB, xlsx)
Appendix S1. MCScanX analysis.

tpj0079-0981-sd6.docx (99.7KB, docx)
tpj0079-0981-sd7.docx (25.5KB, docx)

References

  1. Begun DJ, Aquadro CF. Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature. 1992;356:519–520. doi: 10.1038/356519a0.
  2. Berchowitz LE, Hanlon SE, Lieb JD, Copenhaver GP. A positive but complex association between meiotic double-strand break hotspots and open chromatin in Saccharomyces cerevisiae. Genome Res. 2009;19:2245–2257. doi: 10.1101/gr.096297.109.
  3. Carchilan M, Delgado M, Ribeiro T, Costa-Nunes P, Caperta A, Morais-Cecílio L, Jones RN, Viegas W, Houben A. Transcriptionally active heterochromatin in rye B chromosomes. Plant Cell. 2007;19:1738–1749. doi: 10.1105/tpc.106.046946.
  4. Charlesworth B, Betancourt AJ, Kaiser VB, Gordo I. Genetic recombination and molecular evolution. Cold Spring Harbor Symp. Quant. Biol. 2009;74:1–10. doi: 10.1101/sqb.2009.74.015.
  5. Chen J, Huang Q, Gao D, et al. Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution. Nat. Commun. 2013;4:1595. doi: 10.1038/ncomms2596.
  6. Choo KHA. Why is the centromere so cold? Genome Res. 1998;8:81–82. doi: 10.1101/gr.8.2.81.
  7. Close TJ, Bhat PR, Lonardi S, et al. Development and implementation of high-throughput SNP genotyping in barley. BMC Genomics. 2009;10:582. doi: 10.1186/1471-2164-10-582.
  8. Comadran J, Kilian B, Russell J, et al. Natural variation in a homolog of Antirrhinum CENTRORADIALIS contributed to spring growth habit and environmental adaptation in cultivated barley. Nat. Genet. 2012;44:1388–1392. doi: 10.1038/ng.2447.
  9. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–3676. doi: 10.1093/bioinformatics/bti610.
  10. Druka A, Muehlbauer G, Druka I, et al. An atlas of gene expression from seed to seed through barley development. Funct. Integr. Genomics. 2006;6:202–211. doi: 10.1007/s10142-006-0025-4.
  11. Drummond AJ, Ashton B, Buxton S, et al. 2011. Geneious v6.1.6 created by Biomatters. Available from http://www.geneious.com/
  12. Du Z, Zhou X, Ling Y, Zhang Z, Su Z. AgriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 2010;38:W64–W70. doi: 10.1093/nar/gkq310.
  13. Du J, Zhixi T, Sui Y, Zhao M, Song Q, Cannon SB, Cregan P, Ma J. Pericentromeric effects shape the patterns of divergence, retention, and expression of duplicated genes in the paleopolyploid soybean. Plant Cell. 2012;24:21–32. doi: 10.1105/tpc.111.092759.
  14. Fu H, Zheng Z, Dooner HK. Recombination rates between adjacent genic and retrotransposon regions in maize vary by 2 orders of magnitude. Proc. Natl Acad. Sci. USA. 2002;99:1082–1087. doi: 10.1073/pnas.022635499.
  15. Gent JI, Dong Y, Jiang J, Dawe RK. Strong epigenetic similarity between maize centromeric and pericentromeric regions at the level of small RNAs, DNA methylation and H3 chromatin modifications. Nucleic Acids Res. 2012;40:1550–1560. doi: 10.1093/nar/gkr862.
  16. Hall LE, Mitchell SE, O'Neill RJ. Pericentric and centromeric transcription: a perfect balance required. Chromosome Res. 2012;20:535–546. doi: 10.1007/s10577-012-9297-9.
  17. Hammer Ø, Harper DAT, Ryan PD. PAST: Paleontological statistics software package for education and data analysis. Palaeontologia Electronica. 2001;4:9.
  18. Higgins JD, Perry RM, Barakate A, Ramsay L, Waugh R, Halpin C, Armstrong SJ, Franklin FCH. Spatiotemporal asymmetry of the meiotic program underlies the predominantly distal distribution of meiotic crossovers in barley. Plant Cell. 2012;24:4096–4109. doi: 10.1105/tpc.112.102483.
  19. Houben A, Demidov D, Gernand D, Meister A, Leach CR, Schubert I. Methylation of histone H3 in euchromatin of plant chromosomes depends on basic nuclear DNA content. Plant J. 2003;33:967–973. doi: 10.1046/j.1365-313x.2003.01681.x.
  20. Hudson RR. How can the low levels of DNA sequence variation in regions of the Drosophila genome with low recombination be explained? Proc. Natl Acad. Sci. USA. 1994;91:6815–6818. doi: 10.1073/pnas.91.15.6815.
  21. Huo N, Garvin DF, You FM, McMahon S, Luo M-C, Gu YQ, Lazo GR, Vogel JP. Comparison of a high-density genetic linkage map to genome features in the model grass Brachypodium distachyon. Theor. Appl. Genet. 2011;123:455–464. doi: 10.1007/s00122-011-1598-4.
  22. IBM Corp. IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp; 2012.
  23. International Barley Genome Sequencing Consortium (IBGSC). A physical, genetic, and functional sequence assembly of the barley genome. Nature. 2012;491:711–716. doi: 10.1038/nature11543.
  24. Jost KL, Bertulat B, Cardoso MC. Heterochromatin and gene positioning: inside, outside, any side? Chromosoma. 2012;121:555–563. doi: 10.1007/s00412-012-0389-2.
  25. Kim JS, Islam-Faridi MN, Klein PE, Stelly DM, Price HJ, Klein RR, Mullet JE. Comprehensive molecular cytogenetic analysis of sorghum genome architecture: distribution of euchromatin, heterochromatin, genes and recombination in comparison to rice. Genetics. 2005;171:1963–1976. doi: 10.1534/genetics.105.048215.
  26. Krzywinski MI, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19:1639–1645. doi: 10.1101/gr.092759.109.
  27. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
  28. Lippman Z, Gendrel AV, Black M, et al. Role of transposable elements in heterochromatin and epigenetic control. Nature. 2004;430:471–476. doi: 10.1038/nature02651.
  29. Luo MC, Deal KR, Akhunov ED, et al. Genome comparisons reveal a dominant mechanism of chromosome number reduction in grasses and accelerated genome evolution in Triticeae. Proc. Natl Acad. Sci. USA. 2009;106:15780–15785. doi: 10.1073/pnas.0908195106.
  30. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–1155. doi: 10.1126/science.290.5494.1151.
  31. Matsumoto T, Tanaka T, Sakai H, et al. Comprehensive sequence analysis of 24,783 barley full-length cDNAs derived from 12 clone libraries. Plant Physiol. 2011;156:20–28. doi: 10.1104/pp.110.171579.
  32. Mayer KFX, Martis M, Hedley PE, et al. Unlocking the barley genome by chromosomal and comparative genomics. Plant Cell. 2011;23:1249–1263. doi: 10.1105/tpc.110.082537.
  33. Milne I, Stephen G, Bayer M, Cock PJA, Pritchard L, Cardle L, Shaw PD, Marshall D. Using Tablet for visual exploration of second-generation sequencing data. Brief. Bioinform. 2013;14:193–202. doi: 10.1093/bib/bbs012.
  34. Nordborg M, Hu TT, Ishino Y, et al. The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol. 2005;3:e196. doi: 10.1371/journal.pbio.0030196.
  35. Okonechnikov K, Golosova O, Fursov M. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics. 2012;28:1166–1167. doi: 10.1093/bioinformatics/bts091.
  36. Paterson AH, Bowers JE, Bruggman R, et al. The Sorghum bicolor genome and the diversification of the grasses. Nature. 2009;457:551–556. doi: 10.1038/nature07723.
  37. Prlic A, Yates A, Bliven SE, et al. BioJava: an open-source framework for bioinformatics. Bioinformatics. 2012;28:2693–2695. doi: 10.1093/bioinformatics/bts494.
  38. Rizzon C, Ponger L, Gaut BS. Striking similarities in the genomic distribution of tandemly arrayed genes in Arabidopsis and rice. PLoS Comput. Biol. 2006;2:e115. doi: 10.1371/journal.pcbi.0020115.
  39. Salse J, Bolot S, Throude M, Jouffe V, Piegu B, Quraishi UM, Calcagno T, Cooke R, Delseny M, Feuillet C. Identification and characterization of shared duplications between rice and wheat provide new insight into grass genome evolution. Plant Cell. 2008;20:11–24. doi: 10.1105/tpc.107.056309.
  40. Sato K, Shin IT, Shinozaki K, Takeda Y, Yamazaki Y, Conte M, Kohara Y. Development of 5006 full-length cDNAs in barley: a tool for accessing cereal genomics resources. DNA Res. 2009;16:81–89. doi: 10.1093/dnares/dsn034.
  41. Schmutz J, Cannon SB, Schlueter J, et al. Genome sequence of the paleopolyploid soybean. Nature. 2010;463:178–183. doi: 10.1038/nature08670.
  42. Schnable PS, Ware D, Fulton RS, et al. The B73 Maize genome: complexity, diversity and dynamics. Science. 2009;326:1112–1115. doi: 10.1126/science.1178534.
  43. Schnable JC, Springer NM, Freeling M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc. Natl Acad. Sci. USA. 2011;108:4069–4074. doi: 10.1073/pnas.1101368108.
  44. Schnable JC, Freeling M, Lyons E. Genome-wide analysis of syntenic gene deletion in the grasses. Genome Biol. Evol. 2012;4:265–277. doi: 10.1093/gbe/evs009.
  45. Shi J, Dawe RK. Partitioning of the maize epigenome by the number of methyl groups on histone H3 lysines 9 and 27. Genetics. 2006;173:1571–1583. doi: 10.1534/genetics.106.056853.
  46. Steffenson BJ, Olivera P, Roy JK, Yue A, Jin B, Smith KP, Muehlbauer GJ. A walk on the wild side: mining wild wheat and barley collections for rust resistance genes. Aust. J. Agric. Res. 2007;58:532–544.
  47. Thiel T, Graner A, Waugh R, Grosse I, Close TJ, Stein N. Evidence and evolutionary analysis of ancient whole-genome duplication in barley predating the divergence from rice. BMC Evol. Biol. 2009;9:209. doi: 10.1186/1471-2148-9-209.
  48. Thomas BC, Pederson BS, Freeling M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 2006;16:934–946. doi: 10.1101/gr.4708406.
  49. Tian Z, Rizzon C, Du J, Zhu L, Bennetzen JL, Jackson SA, Gaut BS, Ma J. Do genetic recombination and gene density shape the pattern of DNA elimination in rice long terminal repeat retrotransposons? Genome Res. 2009;19:2221–2230. doi: 10.1101/gr.083899.108.
  50. Wang Y, Tang H, DeBarry JD, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49. doi: 10.1093/nar/gkr1293.
  51. Wicker T, Buchmann JP, Keller B. Patching gaps in plant genomes results in gene movement and erosion of colinearity. Genome Res. 2010;20:1229–1237. doi: 10.1101/gr.107284.110.
  52. Wicker T, Mayer KFX, Gundlach H, et al. Frequent gene movement and pseudogene evolution is common to the large and complex genomes of wheat, barley and their relatives. Plant Cell. 2011;23:1706–1718. doi: 10.1105/tpc.111.086629.
  53. Wolfe KH. Robustness—it's not where you think it is. Nat. Genet. 2000;25:3–4. doi: 10.1038/75560.
  54. Wolfe KH. Yesterday's polyploids and the mystery of diploidization. Nat. Rev. Genet. 2001;2:333–341. doi: 10.1038/35072009.
  55. Woodhouse MR, Schnable JC, Pederson BS, Lyons E, Lisch D, Subramaniam S, Freeling M. Following tetraploidy in maize, a short deletion mechanism removed genes preferentially from one of two homeologs. PLoS Biol. 2010;8:e1000409. doi: 10.1371/journal.pbio.1000409.
  56. Wright SI, Foxe JP, DeRose-Wilson L, Kawabe A, Looseley M, Gaut BS, Charlesworth D. Testing for effects of recombination rate on nucleotide diversity in natural populations of Arabidopsis lyrata. Genetics. 2006;174:1421–1430. doi: 10.1534/genetics.106.062588.
  57. Wu Y, Kikuchi S, Yan H, Zhang W, Rosenbaum H, Iniguez L, Jiang J. Euchromatic subdomains in rice centromeres are associated with genes and transcription. Plant Cell. 2011;23:4054–4064. doi: 10.1105/tpc.111.090043.
  58. Yang Z. PAML 4: a program package for phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  59. Yang L, Gaut BS. Factors that contribute to variation in evolutionary rate among Arabidopsis genes. Mol. Biol. Evol. 2011;28:2359–2369. doi: 10.1093/molbev/msr058.
  60. Yang J, Gu Z, Li WH. Rate of protein evolution versus fitness effect of gene deletion. Mol. Biol. Evol. 2003;20:772–774. doi: 10.1093/molbev/msg078.
  61. Yin BL, Guo L, Zhan DF, Terzaghi W, Wang XF, Liu TT, Hea H, Cheng ZK, Deng XW. Integration of cytological features with molecular and epigenetic properties of rice chromosome 4. Mol. Plant. 2008;1:816–829. doi: 10.1093/mp/ssn037.
  62. Zhang L, Gaut BS. Does recombination shape the distribution and evolution of tandemly arrayed genes (TAGs) in the Arabidopsis thaliana genome? Genome Res. 2003;13:2533–2540. doi: 10.1101/gr.1318503.

Articles from The Plant Journal are provided here courtesy of Wiley
