Abstract
Advances in our understanding of genome structure provide consistent evidence for the existence of a core genome representing species classically defined by phenotype, as well as conditionally dispensable components of the genome that show extensive variation between individuals of a given species. Generally, conservation of phenotypic features between species reflects conserved features of the genome; however, this is not always the case, as demonstrated by the analysis of the tunicate chordate Oikopleura dioica. In both plants and animals, the methylation of DNA and histones continues to present new variables for modifying (eventually) the phenotype of an organism and provides for structural variation that builds on the point mutations, rearrangements, indels, and amplification of retrotransposable elements traditionally considered. The translation of advances in the structure/function analysis of the genome to industry is facilitated through the capture of research outputs in “toolboxes” that remain accessible in the public domain.
Keywords: Genes, Genome, Methylation, Conditionally dispensable genome regions
Introduction
This mini-review is based on presentations and summaries given at Plant and Animal Genome (PAG) XXI, held in San Diego in January 2013. Where unpublished information is cited, permission to include the information in this manuscript was obtained from the presenters. The PAG meeting covers a very broad range of data exchange through many workshops and industry exhibitions, and the invited plenary lectures provide a particularly good snapshot of areas that are exciting and have relevance across the broader plant and animal disciplines. In the area of genome evolution and population level studies, plenary lectures were provided by Michele Morgante (University of Udine, Italy), Greger Larson (Durham University, UK) and Daniel Chourrout (Centre for Marine Biology, Norway); in the area of translating genomics to industry and modifying specific traits, Gary Muehlbauer (University of Minnesota, USA) provided insights into the cereals; in the area of analysing changes in gene expression of organs, Steven Jacobsen provided an extensive overview; in the area of informatics technology and communication, Eric Perakslis (US FDA, USA), Michael B Eisen (UC-Berkeley, USA) and Goncalo Abecasis (University of Michigan, USA) presented their views and achievements.
Analysing the basis for change in gene expression
The plenary lecture by Steven Jacobsen reviewed the status of epigenetic modification in chromatin and the general issue of gene silencing in plants. The methylation of cytosine (Cm, the “fifth base”) residues is now recognised as a major variable in the control of gene expression, and assaying Cm can be carried out as part of a genome sequencing project using bisulphite sequencing (Cokus et al. 2008). The Cm-containing genome sequences include CG, methylated by METHYLTRANSFERASE 1 (MET1); CHG, methylated by CHROMOMETHYLASE3 (CMT3); and CHH, methylated by DOMAINS REARRANGED METHYLASE 2 (DRM2), where H represents A, C or T (see Footnotes). The DRM2 enzyme is required for the methylation of the cytosines in all the sequences. The methyltransferase KRYPTONITE (KYP/SUVH4) is responsible for the dimethylation of histone H3 and also binds to methylated DNA to provide an association between DNA and histone methylation that is similar to that observed in animals (Johnson et al. 2007). Specific regions of the genomic DNA such as retrotransposable elements can be methylated through an RNA-directed pathway involving RNA polymerases IV and V, plus small interfering RNAs (Zhong et al. 2012). Demethylation of histones (Krichevsky et al. 2011) and of specific loci in genomic DNA (Penterman et al. 2007) are additional variables in the control of gene expression. In a genome-wide analysis, Stroud et al. (2013) mapped the variation in methylation at the single nucleotide level within the Arabidopsis genome, in response to mutations in a set of 86 genes involved in the gene silencing pathway. As expected, mutation in MET1 eliminated CG methylation, while mutations in the genes VARIANT IN METHYLATION (VIM1, VIM2 and VIM3) dramatically affected methylation but showed functional redundancy within the group. The gene DECREASE IN DNA METHYLATION 1 was associated mainly with the methylation of DNA in heterochromatin. The methylation of CHG, again as expected, was depleted by mutation in CMT3. Mutations in KYP, SUVH5 and SUVH6 showed similar effects when compared to each other. The mutations did not, however, show a uniform loss of CHG methylation, and it was evident that the DRM1/2 genes also defined sites of methylation that did not overlap CMT3 targets. The methylation of the CHH motif was found to be closely linked to methylation of CHG, but the sites could be differentiated in that KYP and SUVH5/6 regulated CHH methylation in an RNAi-dependent manner. The gene networks controlling DNA methylation included genes controlling the RNAi pathway as well as those modifying chromatin structure. It was evident that new factors and gene networks controlling DNA methylation remain to be discovered (Berdasco et al. 2008; Stroud et al. 2013).
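The three sequence contexts described above (CG, CHG and CHH, with H standing for A, C or T) can be made concrete with a short sketch. The Python below is only an illustration, not part of any pipeline discussed at the meeting: the function names and the toy sequence are hypothetical, and only the forward strand is examined.

```python
from collections import Counter
from typing import Optional

H_BASES = "ACT"  # "H" stands for A, C or T


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at 0-based index pos.

    Returns None if the base is not a C, or if there are too few
    downstream bases (or an ambiguous base) to decide the context.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        return None
    downstream = seq[pos + 1:pos + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    if len(downstream) == 2 and downstream[0] in H_BASES:
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in H_BASES:
            return "CHH"
    return None


def count_contexts(seq: str) -> Counter:
    """Count forward-strand cytosines in each methylation context."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)


# Hypothetical toy sequence, for illustration only.
toy = "ACGTCAGCTTC"
print(count_contexts(toy))
```

In a real bisulphite sequencing analysis both strands would be considered and each site would also carry a methylation level; the sketch only shows how the contexts themselves are defined.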
Changes in DNA methylation in response to biotic stress (Dowen et al. 2012) and in the tissue culture of cells (Cheng et al. 2006) have been reported. In rice, the Tos17-LTR retrotransposable elements that are activated by the tissue culture process are undermethylated, in contrast to plants regenerated from tissue culture where the Tos17 elements are successively methylated again with each generation (Cheng et al. 2006). Loss of function of the SET DOMAIN GROUP PROTEIN (Baumbusch et al. 2001; Caro et al. 2012) in rice (SDG714) decreases DNA methylation and increases the transposition of Tos17 elements (Ding et al. 2007). In the case of leaf cells of Arabidopsis reacting to the biotrophic pathogen Pseudomonas syringae, the fine mapping of changes in methylation of the genome (Dowen et al. 2012) has provided evidence for localised changes in methylation. A prominent phenotype is cell death in response to the pathogen. Although the overall distribution of Cm across the genome was similar before and after infection, a detailed analysis showed that changes in methylation occurred differentially in gene-rich regions compared to the rest of the genome, with a peak in differentially methylated sites within 1 kb upstream of the start of transcription. A feature of the distribution was the differential methylation of transposable elements near protein coding genes (Dowen et al. 2012). The response of tissue to salicylic acid, rather than P. syringae, involved more transposable elements and suggested that, on a broad scale, the programming of DNA methylation is integral to the control of gene expression.
Genome evolution and population level studies
Daniel Chourrout discussed the coastal marine planktonic chordate Oikopleura dioica in his plenary lecture. The genome and transcriptome resources for this organism are well established in the form of OikoBase (Danks et al. 2012) and early studies on ribosomal protein, EF-1a, Hox proteins and tubulin gene families (Edvardsen et al. 2004) indicated that these invertebrates have a very compact genome (18,020 predicted genes in 70 Mb, Denoeud et al. 2010) with short introns at variable positions within genes. The Hox genes, important in development, were not clustered as in most organisms studied to date (Seo et al. 2004). The genome is housed within three chromosomes plus an X and Y chromosome and has been found to be under-methylated compared to other species. The striking feature of O. dioica is that key elements of its genome, in an evolutionary context, are unique to the species even though its phenotype provides the basis for its unambiguous classification as a tunicate chordate (Stach et al. 2008). Most transposable superfamilies of retrotransposable elements are missing from the O. dioica genome and synteny in chromosomal gene order to organisms such as Amphioxus, Ciona, Caenorhabditis, and sea anemone that are related in phenotype has been lost (Denoeud et al. 2010). A striking contrast exists within the O. dioica genome between the basic compact structure of most of the genome and the structure of the Y chromosome which shows very large introns in the genes housed within this chromosome (Denoeud et al. 2010). The studies highlight the significant gaps in our understanding in relating genome structure to function/phenotype in an organism.
DNA repair genes are missing in the O. dioica genome which is consistent with the high mutation rates deduced from a comparison of the genome sequences from populations from the eastern Atlantic and eastern Pacific oceans (Denoeud et al. 2010). Mutations in non-silent sites of genes were low compared to changes in silent sites in genes, consistent with strong selection pressures (Denoeud et al. 2010). It appears that in the context of the core and conditionally dispensable regions of genome discussed below (following paragraph), the core elements of the O. dioica genome have changed extensively and that new DNA sequences have been recruited from the variable, conditionally dispensable, regions of the genome to provide the basis for retaining the evolutionary conserved phenotypes that characterise tunicate chordates.
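The distinction between silent and non-silent changes used above can be illustrated with a minimal sketch. It assumes the standard nuclear genetic code and single-codon comparisons; the function name and example codons are hypothetical and are not taken from the analysis of Denoeud et al. (2010).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (an assumption of this sketch).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as 'identical', 'silent' or 'non-silent'."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if ref == alt:
        return "identical"
    if CODON_TABLE[ref] == CODON_TABLE[alt]:
        return "silent"  # same amino acid encoded
    return "non-silent"  # amino acid change or stop codon


# Hypothetical codon pairs, for illustration only.
for ref, alt in [("CTT", "CTC"), ("ATG", "ATA"), ("TGG", "TGA")]:
    print(ref, alt, classify_substitution(ref, alt))
```

A low count of non-silent changes relative to silent ones across many aligned codons is the pattern the text interprets as evidence of strong selection.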
Genome-level analysis of individuals within populations of higher eukaryotes has also modified the way genomes are perceived. In the human genome (PAG 2010 in Appels et al. 2010), the effort to complete linear DNA sequences for each chromosome has demonstrated that a single representative reference genome sequence is not feasible and that, instead, a more complex view is required to show INDEL polymorphisms, small inversions and duplications in order to provide a template for analysing SNPs. In his plenary lecture at the PAG, Michele Morgante discussed the pan-genome concept developed first in bacteria (Tettelin et al. 2005) as a basis for considering a genome as consisting of core and conditionally dispensable elements (Morgante et al. 2007). In bacteria, Tettelin et al. (2005) compared the genome sequences of six strains of Streptococcus agalactiae to those available in databases and found that 80 % of the genome sequences could be assigned to a core genome. The remaining 20 % (referred to as dispensable) was found to be highly variable and included strain-specific genes, and was suggested to provide a reservoir of genes for modifying the pan-genome. Mobile and extrachromosomal elements were found to be prominent in the dispensable part of the genome. Based on the frequency of discovery of unique genes in this dispensable part of the genome, Tettelin et al. (2005) suggested that the total gene complement of a species may in fact be difficult to determine because new genes can be continually acquired into this part of the genome through lateral gene transfer.
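The core/dispensable partition behind the pan-genome concept can be sketched with simple set operations. The Python below is a minimal illustration, assuming each strain has already been reduced to a set of gene identifiers; the function name, strain names and gene labels are hypothetical and do not come from Tettelin et al. (2005).

```python
from typing import Dict, Set


def partition_pan_genome(strain_genes: Dict[str, Set[str]]) -> dict:
    """Split the pan-genome into core, dispensable and strain-specific genes."""
    if not strain_genes:
        raise ValueError("at least one strain is required")
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes shared by all strains
    dispensable = pan - core                 # genes missing from some strain
    strain_specific = {}
    for name, genes in strain_genes.items():
        others = set().union(
            *(g for n, g in strain_genes.items() if n != name)
        )
        strain_specific[name] = genes - others
    return {
        "pan": pan,
        "core": core,
        "dispensable": dispensable,
        "strain_specific": strain_specific,
    }


# Hypothetical toy data, for illustration only.
strains = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneC", "geneE"},
}
result = partition_pan_genome(strains)
print(sorted(result["core"]), sorted(result["dispensable"]))
```

Tracking how many strain-specific genes each newly added strain contributes is one way to see why the total gene complement of a species may be hard to pin down.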
In eukaryotes, the properties of the conditionally dispensable elements of the genome were classically illustrated in the control of gene expression by heterochromatin in Drosophila (Hilliker and Appels 1982). Regions of chromosomes defined, by cytology, as heterochromatin contain long intergenic tracts of simple sequence repeats and can moderate gene expression depending on their position within the genome (Hilliker and Appels 1982). The long tracts of gene-free genome sequences are not actually essential for life as demonstrated experimentally by generating large deletions in mice (Nobrega et al. 2004) and can therefore be defined as conditionally dispensable. A property of conditionally dispensable parts of the genome noted by Morgante would include the co-option of DNA sequences which contribute to differentiating individuals within a species. These regions of the genome that differentiate individuals could also contribute to heterosis/hybrid vigour (reviewed in Springer and Stupar 2007a) since hybrids show non-additive gene expression, outside the range of the parents. If the elements and strain-specific genes within conditionally dispensable regions of the genome contribute to heterosis through processes of complementation, it would be as part of a suite of molecular mechanisms ranging from new protein–protein interactions to new epigenetic states within the hybrids (Springer and Stupar 2007a, b).
The database of DNA sequence information from the genomes of plants and animals is rapidly expanding, and in his plenary lecture Morgante showed how this provides a detailed view of the conditionally dispensable parts of the genome, focusing on the analysis of the Zea mays (maize) and Vitis vinifera (grape) genomes. The whole genome comparison between the maize lines Mo17 and B73 (Brunner et al. 2005; Morgante et al. 2005) provided evidence for extensive differences between the two genomes driven by changes in the retrotransposable element component of the genome, even though crosses between lines are fertile. The regions of the genome not shared between the lines amounted to 50 % of the total sequence space. The fragments of genes in these highly polymorphic regions included sequences located in helitron transposable elements that were distributed among the different maize chromosomes, as determined from analysing oat–maize chromosome addition lines (Morgante et al. 2005). Regions of gene clusters tended to be restricted in their distribution within the genome compared to single exons. The more detailed comparison of specific loci on chromosomes 1S, 1L, 2S, 7S and 9S by Brunner et al. (2005) between Mo17 and B73 showed that colinearity between the genome regions was fragmented due to the insertion of long terminal repeat (LTR) retrotransposons and gene fragments. The inserted LTRs were more recent in origin than the colinear parts of the genome. The effective population size of the non-colinear, or dispensable, sequence segments has been noted to differ from that of the core genome regions (Brunner et al. 2005) and, where these segments are linked closely to genes affecting agronomically important traits, would be expected to have significant effects on breeding.
In addition to qualitative differences between related chromosome regions, quantitative differences in sequences >1 kb in size (CNVs) relative to B73, for 13 maize lines (including Mo17), have been determined using an array-based comparative genome hybridisation approach (Belo et al. 2010a). Approximately half of the 2,109 dispersed and clustered CNVs occurred in only one of the maize lines and have been considered to be potential contributors to heterosis (Belo et al. 2010b).
The advances in the DNA sequence-based characterization of the 14 varieties of grapes discussed by Morgante were aided by the availability of an extensive description of the grape genome (Jaillon et al. 2007; http://www.genoscope.cns.fr/externe/GenomeBrowser/Vitis/). The 19 Vitis chromosomes are represented today by a genome assembly of 33 mapped ultracontigs (N50 = 23 Mb) that provide 91.2 % coverage of the genome. Historically, the domestication and development of grape varieties are closely linked to human settlement and agricultural practices, from the southern borders of the Black and Caspian seas to Afghanistan, ca 4000 BC (Olmo 1976). The transfer of grape plants to environments beyond this natural range led to hybridisation with wild Vitis species and produced new lines better adapted to the local environment (Cipriani et al. 2010). It is estimated that 10,000 varieties derive from the V. vinifera species originally domesticated in the Middle East (Olmo 1976). Genetic heterozygosity is a feature of this crop (Scalabrin et al. 2011). The resequencing analysis of the 14 grape varieties reported by Morgante used paired end sequencing and software such as BreakDancer (Chen et al. 2009) and DNACopy (Venkatraman and Olshen 2007) for analysing structural variation between the genome sequences. A total of 36,000 INDELS, 1–25 kb in length, were identified, and it was deduced from the analyses that these were mainly due to transposition events resulting from the movement of Copia and Gypsy elements. The LINE elements were particularly polymorphic, and these polymorphic elements were prominent in introns. An additional 147 Mbp of large deletions were identified with a depth of coverage approach. The genes in the stilbene (Vannozzi et al. 2012), terpene (Martin et al. 2010) and flavonoid pathways (Falginella et al. 2010) were used to illustrate the variation found. In the anthocyanin pathway, Kobayashi et al. (2004), for example, demonstrated that a retrotransposon (Gret1, 10,422 bp) insertion into the promoter region of the transcription factor VvmybA1 was associated with the loss of pigmentation in white cultivars of V. vinifera. This characterization of the red vs white skin colour of grapes was consistent with the importance of transposition events in the domestication of grapes. A recent transcriptome analysis of V. vinifera cv Corvina (Venturini et al. 2013) identified 180 genes in the conditionally dispensable space of the genome, with 50 being differentially expressed.
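The N50 statistic quoted for the grape assembly summarises contiguity: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below shows the usual calculation; the function name and the toy lengths are hypothetical and are not the grape ultracontig lengths.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half
        # of the assembly is the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths (arbitrary units), for illustration only.
print(n50([2, 3, 4, 5, 6]))
```

An N50 of 23 Mb therefore means that at least half of the assembled sequence sits in ultracontigs of 23 Mb or longer.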
The role of introgression in building up the conditionally dispensable regions of genomes throughout the history of domestication, in the way indicated above for maize and grapes, was discussed in detail by Greger Larson. The availability of large, and rapidly expanding, databases of plant and animal species used as a food source provides a basis for defining the role of introgression in the history of domestication. In particular, Larson discussed the use of the DNA sequence databases to quantify variation and estimate the start of the domestication process while accepting that the process does not have an end and continues to this day (Larson and Burger 2013). Three separate domestication pathways have been defined by Zeder (2012) and have facilitated the development of population models that define the duration of the early capture period, the presence and size of bottlenecks, and the number and geographic distribution of potential ancestral populations. In the “commensal pathway”, a close link to the advent of agriculture is usually evident, followed by increasing degrees of deliberate human action as the relationship between humans and the respective plants and animals developed (Larson and Burger 2013). Animals that were initially predated upon by people are included in the “prey pathway”, where animals that were first hunted away from human settlements were subsequently more directly managed as they were brought into closer proximity with people. The population structure most often indicates a major bottleneck and short time frame for change. Plants and animals that followed the “directed pathway” tended to do so after a long time within the agriculture system; the process bypasses the early phases of habituation and management and begins with the collection of plants and the capture of wild animals with the deliberate intention of controlling their breeding. This directed pathway is accompanied by a dramatic bottleneck.
In the commensal and prey pathways, admixtures of populations are an important feature of the domestication process. The establishment of a high quality (BAC-based) genome reference sequence for pigs (Groenen et al. 2012) has allowed the analysis of the features of their domestication. Studies using nuclear DNA sequences (Groenen et al. 2012) and mitochondrial DNA sequences from both ancient DNA and present-day samples by Larson et al. (2010) have provided evidence consistent with the early evolution of Sus scrofa in the Island South East Asia (ISEA) region followed by migration into the rest of the Asia region and Eurasia. The Chinese domestic pig was argued to be a direct descendant from this ISEA centre of diversity (Larson et al. 2010). Evidence for other centres of domestication in the Indo-Burma and mountainous South East Asia regions (Larson et al. 2010; Charoensook et al. 2011) and a number of centres in Eurasia (Groenen et al. 2012) forms the basis for the network of domestication. Bottlenecks in the pig lineages due to climatic change also need to be considered together with extinction events such as that of a Pacific clade haplotype in South East Asia (Larson et al. 2010). Admixture analyses carried out by Groenen et al. (2012) on nuclear DNA sequences within S. scrofa lineages demonstrated gene flow between the northern Chinese and European populations and varying degrees of exchange between domesticated pig lineages and their wild relatives. The latter was consistent with the semi-managed state in which pigs were kept in the early agricultural communities (Groenen et al. 2012).
The analyses of other species closely associated with human societies (Larson et al. 2012; Larson and Burger 2013) consistently indicate that initial domestication processes are clearly distinguishable from subsequent movement and admixture with local wild populations. It was also evident that large-scale genome sequence analyses will continue to impact on defining the details of the co-evolution of human societies and groups of plants and animals. In rice and other cereals, the extensive genome sequencing is leading to better utilisation of wild relatives for crop improvement even though comparisons between the different Oryza genomes (for example) have demonstrated extensive DNA rearrangement in the colinearity of genes (Tian et al. 2011). Comparisons between the D genome donor to bread wheat and barley show conservation of gene order (Fig. 1) against a background of structural variation that is known to exist. For both rice and bread wheat, the wild relatives provide an extensive array of new gene alleles, and genes per se, for cultivar improvement (Kovach and McCouch 2008; Trethowan and Mujeeb-Kazi 2008; Appels et al. 2012) in the directed pathway category of the domestication process discussed by Larson.
Translation of genomics to industry and modifying specific traits
Genome sequencing of cereals is now maturing as the BAC-based sequence assemblies for rice (rgp.dna.affrc.go.jp/IRGSP), maize (www.maizegdb.org) and barley (International Barley Genome Sequencing Consortium, IBGSC 2012) are being utilised across a broad range of applications, with key outputs being the projection of phenotypic traits important to the industry onto the genome DNA sequence. For hexaploid wheat (IWGSC, www.wheatgenome.org; Brenchley et al. 2012) and its A genome (Ling et al. 2013) and D genome (Jia et al. 2013; see also Fig. 1) diploid genome donors, whole genome sequencing technology has provided an important step in assigning genome sequences to molecular genetic maps and traits of agronomic significance (Feuillet et al. 2012). In his plenary presentation, Gary Muehlbauer focused on barley molecular genetics, with particular reference to Fusarium head blight (FHB) resistance and to integrating genomics resources and genetic variation in wild barleys into barley breeding programmes. The current physical/sequence map for barley comprises 4.98 Gb with 79,379 transcript clusters identified through alignments with cDNA and RNA-seq data. The annotated genes include 26,159 genes that were supported by homology to genes in other plant genomes (for example see Fig. 1; Middleton et al. 2012). Based on RNA-seq data, 55 % of the genes classified as high confidence showed evidence for alternative splicing (IBGSC 2012), and this was argued to represent a significant variable in linking gene expression to the final phenotype. The retrotransposable element/repetitive sequence content of the genome was estimated to represent 84 % of the total DNA sequence. In addition to the mapping of single nucleotide polymorphisms by sequencing (IBGSC 2012), surveys of wild and cultivated barleys by comparative genome hybridisation arrays (Muñoz-Amatriaín et al. 2010) have also demonstrated that a significant proportion (15 %) of the barley genome is affected by copy number variation (CNV) in DNA sequences. The telomeric regions of chromosomes were enriched for CNVs, and this correlated with an enrichment of genes in these regions. It is possible that the CNVs contribute to the phenotypic diversity of barleys (Muñoz-Amatriaín et al. 2010) and may relate to the conditionally dispensable regions of the genome discussed earlier.
In order to translate the basic findings of the barley genome into industry, Muehlbauer argued that contemporary breeders utilise molecular data to predict phenotypes that are of value in novel germplasm, increasing the speed and efficiency of their programmes as well as reducing costs (Waugh et al. 2010; Comadran et al. 2011; Okagaki et al. 2012; Blake et al. 2012; Berger et al. 2012). The outputs from research and germplasm analysis projects in barley are made available to the industry through The Hordeum Toolbox (THT), which integrates extensive phenotypic and genotypic data sets for further downstream analyses (Szűcs et al. 2009; Blake et al. 2012). THT was argued to facilitate the sharing of data between breeding programmes. The analysis of a complex trait such as FHB has necessitated the analysis of a broad range of variables, from defining resistance haplotypes in wild germplasms through to defining QTL for FHB resistance on chromosomes 2H and 6H in new molecular marker genetic maps (Huang et al. 2013). The analysis uncoupled FHB resistance from head phenotype (two rows vs six rows, Cuesta-Marcos et al. 2010), identified novel alleles for the known and robust FHB QTL, and provided a good example of the broad integration of technologies. Progressive use of Genome Wide Association Studies (GWAS, Berger et al. 2013; Cuesta-Marcos et al. 2010) expands the database of molecular information linked to phenotype, and THT provides a valuable model for translating these data to industry.
Informatics technologies and communication
The plenary lecture by Goncalo Abecasis provided insights into the computational analysis of complex disease loci and finding rare variants associated with disease in human populations. The ability to attribute or impute genome variation to a particular disease has improved as the data set of genome sequences has increased (The 1000 Genomes Project Consortium 2012). A specific example was discussed related to macular degeneration (Li et al. 2006b; Rachauduri et al. 2011) and its association with a ca 70 kb deletion on the long arm of chromosome 1 (CFHR1-CFHR3) and SNPs in a closely linked complement factor H locus (CFH). Overall, the analysis indicated that the CFHR1-CFHR3 deletion and CFH locus accounted for “modest” levels of variation in the phenotype, consistent with the possibility of multiple susceptibility alleles in the region of chromosome 1 that was analysed (Li et al. 2006a, b). In general, it is apparent that integrating additional knowledge related to the ancestry of the individuals analysed and the biological attributes of the trait studied (Li et al. 2006a; Chasman et al. 2012) is important in utilising GWAS for defining loci contributing to complex traits. In addition, DNA sequence information from more individuals would increase the power of detecting rare alleles. The challenge of balancing the cost of sequencing against genome coverage, genotype calling and the statistical power of the output data for GWAS was discussed by Abecasis from the computational point of view through the use of the AbCD (Kang et al. 2013) and Triocaller (Chen et al. 2013) software.
The plenary lectures by Eric Perakslis and Michael Eisen debated different aspects of freedom to access information for scientific research. Perakslis described the large scale of the records that are kept by the Food and Drug Administration of the USA in relation to food and drug imports and the use of PREDICT software (www.fda.gov/Forindustry/Importprogram/ucm172743) to handle the risk-based screening procedures and anticipate innovation. The database underpinning PREDICT required the details of contents, including points of origin of the different components within food and drug products, and was a potentially valuable resource for pre-competitive research related to food security. It was noted by Perakslis that this level of data sharing required an “honest broker” within the framework of the Federal Information Security Management Act. The issue of data sharing and data access was also addressed by Michael Eisen. The focus for Eisen was the contrast between DNA, RNA and protein sequence databases, which were free to access, and scientific text in the form of publications, which was generally only accessible after payment of a fee. The argument was made that in an area of science such as that represented by the PAG, sophisticated search engines equivalent to BLAST for DNA, RNA and protein sequences should be available for words/concepts in order to enrich the interpretation of complex data sets and draw on observations and conclusions from the analysis of organisms outside a particular focused area of interest.
Acknowledgments
The authors acknowledge the organisers of the Plant and Animal Genome conference for bringing together a stimulating meeting in January each year; the content of this meeting provides a snapshot of developments in the genomics area and forms the basis for this mini-review.
Footnotes
The H designation represents the bases A, C or T
References
- Appels R, Barrero R, Bellgard M. Advances in biotechnology and linking outputs to variation in complex traits: Plant and Animal Genome meeting January 2012. Funct Integr Genomics. 2012;12:1–9. doi: 10.1007/s10142-012-0270-7.
- Appels R, Barrero R, Keeble G, Bellgard M. Advances in genome studies: the PAG 2010 conference. Funct Integr Genomics. 2010;10:1–9. doi: 10.1007/s10142-010-0164-5.
- Baumbusch LO, Thorstensen T, Krauss V, Fischer A, Naumann K, Assalkhou R, Schulz I, Reuter G, Aalen RB. The Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four evolutionarily conserved classes. Nucleic Acids Res. 2001;29:21. doi: 10.1093/nar/29.21.4319.
- Belo A, Beatty MK, Hondred D, Fengler KA, Li B, Rafalski A. Allelic genome structural variations in maize detected by array comparative genome hybridization. Theor Appl Genet. 2010;120:355–367. doi: 10.1007/s00122-009-1128-9.
- Belo, et al. TAG. High resolution mapping of genomic structural variation. Nat Methods. 2010;6:677–681. doi: 10.1038/nmeth.1363.
- Berdasco M, Alcazar R, Garcia-Ortiz MV, Ballestar E, Fernandez AF, Roldan-Arjona T, Tiburcio AF, Altabella T, Buisine N, Quesneville H, Baudry A, Lepiniec L, Alaminos M, Rodriguez R, Lloyd A, Colot V, Bender J, Canal MJ, Esteller M, Fraga MF. Promoter DNA hypermethylation and gene repression in undifferentiated Arabidopsis cells. PLoS One. 2008;3:e3306. doi: 10.1371/journal.pone.0003306.
- Berger GL, Liu S, Hall MD, Brooks WS, Chao S, Muehlbauer GJ, Baik BK, Steffenson B, Griffey CA. Marker-trait associations in Virginia Tech winter barley identified using genome-wide mapping. Theor Appl Genet. 2013;126:693–710. doi: 10.1007/s00122-012-2011-7.
- Berger GL, Liu S, Hall MD, Brooks WS, Chao S, Muehlbauer GJ, Baik BK, Steffenson B, Griffey CA. Marker-trait associations in Virginia Tech winter barley identified using genome-wide mapping. Theor Appl Genet. 2012;126(3):693–710. doi: 10.1007/s00122-012-2011-7.
- Blake VC, Kling JG, Hayes PM, Jannink J-L, Jillella SR, Lee J, Matthews DE, Chao S, Close TJ, Muehlbauer GJ, Smith KP, Wise RP, Dickerson JA. The Hordeum Toolbox: the barley coordinated agricultural project genotype and phenotype resource. Plant Genome. 2012;5:81–91. doi: 10.3835/plantgenome2012.03.0002.
- Brenchley R, Spannagl M, Pfeifer M, Barker GLA, D’Amore R, Allen AM, McKenzie N, Kramer M, Kerhornou A, Bolser D, Kay S, Waite D, Trick M, Bancroft I, Gu Y, Huo N, Luo M-C, Sehgal S, Gill BS, Kianian S, Anderson O, Kersey P, Dvorak J, McCombie WR, Hall A, Mayer KFX, Edwards KJ, Bevan MW, Hall N. Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature. 2012;491:705–710. doi: 10.1038/nature11650.
- Brunner S, Fengler K, Morgante M, Tingey S, Rafalski A. Evolution of DNA sequence nonhomologies among maize inbreds. Plant Cell. 2005;17:343–360. doi: 10.1105/tpc.104.025627.
- Caro E, Stroud H, Greenberg MVC, Bernatavichute YV, Feng S, Groth M, Vashisht AA, Wohlschlegel J, Jacobsen SE. The SET-domain protein SUVR5 mediates H3K9me2 deposition and silencing at stimulus response genes in a DNA methylation-independent manner. PLoS Genet. 2012;8(10):e1002995. doi: 10.1371/journal.pgen.1002995.
- Charoensook R, Brenig B, Gatphayak K, Knorr C. Further resolution of porcine phylogeny in Southeast Asia by Thai mtDNA haplotypes. Anim Genet. 2011;42:445–450. doi: 10.1111/j.1365-2052.2011.02175.x.
- Chasman DI, et al. Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function. Hum Mol Genet. 2012;21:5329–5343. doi: 10.1093/hmg/dds369.
- Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, McGrath SD, Wendl MC, Zhang Q, Locke DP, Shi X, Fulton RS, Ley TJ, Wilson RK, Ding L, Mardis ER. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods. 2009;6:677–681. doi: 10.1038/nmeth.1363.
- Chen W, Li B, Zeng Z, Sanna S, Sidore C, Busonero F, Kang HM, Li Y, Abecasis GR. Genotype calling and haplotyping in parent–offspring trios. Genome Res. 2013;23:142–151. doi: 10.1101/gr.142455.112.
- Cheng C, Daigen M, Hirochika H. Epigenetic regulation of the rice retrotransposon Tos17. Mol Genet Genomics. 2006;276:378–390. doi: 10.1007/s00438-006-0141-9.
- Cipriani G, Spadotto A, Jurman I, Di Gaspero G, Crespan M, Meneghetti S, Frare E, Vignani R, Cresti M, Morgante M, Pezzotti M, Pe E, Policriti A, Testolin R. The SSR-based molecular profile of 1005 grapevine (Vitis vinifera L.) accessions uncovers new synonymy and parentages, and reveals a large admixture amongst varieties of different geographic origin. Theor Appl Genet. 2010;121:1569–1585. doi: 10.1007/s00122-010-1411-9.
- Cokus SJ, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild CD, Pradhan S, Nelson SF, Pellegrini M, Jacobsen SE. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008;452:215–219.
- Comadran J, Ramsay L, MacKenzie K, Hayes P, Close TJ, Muehlbauer G, Stein N, Waugh R. Patterns of polymorphism and linkage disequilibrium in cultivated barley. Theor Appl Genet. 2011;122:523–531. doi: 10.1007/s00122-010-1466-7.
- Cuesta-Marcos A, Szűcs P, Close TJ, Filichkin T, Muehlbauer GJ, Smith KP, Hayes PM. Genome-wide SNPs and re-sequencing of growth habit and inflorescence genes in barley: implications for association mapping in germplasm arrays varying in size and structure. BMC Genomics. 2010;11:707. doi: 10.1186/1471-2164-11-707.
- Danks G, Campsteijn C, Parida M, Butcher S, Doddapaneni H, Fu B, Petrin R, Metpally R, Lenhard B, Wincker P, Chourrout D, Thompson EM, Manak JR. OikoBase: a genomics and developmental transcriptomics resource for the urochordate Oikopleura dioica. Nucleic Acids Res. 2012;41:D845–D853. doi: 10.1093/nar/gks1159.
- Denoeud F, et al. Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate. Science. 2010;330:1381–1385. doi: 10.1126/science.1194167.
- Ding Y, Wang X, Su LS, Zhai JX, Cao SY, Zhang DF, Liu CY, Bi YP, Qian Q, Cheng ZK, Chu CC, Cao XF. SDG714, a histone H3K9 methyltransferase, is involved in Tos17 DNA methylation and transposition in rice. Plant Cell. 2007;19:9–22. doi: 10.1105/tpc.106.048124.
- Dowen RH, Pelizzola M, Schmitz RJ, Lister R, Dowen JM, Nery JR, Dixon JE, Ecker JR. Widespread dynamic DNA methylation in response to biotic stress. Proc Natl Acad Sci U S A. 2012;109(32):E2183–E2191. doi: 10.1073/pnas.1209329109.
- Edvardsen RB, Lerat E, Maeland AD, Flat M, Tewari R, Jensen MF, Lehrach H, Reinhardt R, Seo HC, Chourrout D. Hypervariable and highly divergent intron–exon organizations in the chordate Oikopleura dioica. J Mol Evol. 2004;59:448–457. doi: 10.1007/s00239-004-2636-5.
- Falginella L, Castellarin SD, Testolin R, Gambetta GA, Morgante M, Di Gaspero G. Expansion and subfunctionalisation of flavonoid 3′,5′-hydroxylases in the grapevine lineage. BMC Genomics. 2010;11:562. doi: 10.1186/1471-2164-11-562.
- Feuillet C, et al. Integrating cereal genomics to support innovation in the Triticeae. Funct Integr Genomics. 2012;12(4):573–583. doi: 10.1007/s10142-012-0300-5.
- Groenen MA, Swine Genome Sequencing Consortium. Analyses of pig genomes provide insight into porcine demography and evolution. Nature. 2012;491:393–398. doi: 10.1038/nature11622.
- Hilliker AJ, Appels R. Pleiotropic effects associated with the deletion of heterochromatin surrounding rDNA on the X chromosome of Drosophila. Chromosoma. 1982;86:469–490.
- Huang Y, Millett BP, Beaubien KA, Dahl SK, Steffenson BJ, Smith KP, Muehlbauer GJ. Haplotype diversity and population structure in cultivated and wild barley evaluated for Fusarium head blight responses. Theor Appl Genet. 2013;126:619–636. doi: 10.1007/s00122-012-2006-4.
- International Barley Genome Sequencing Consortium et al. A physical, genetic and functional sequence assembly of the barley genome. Nature. 2012;491:711–716. doi: 10.1038/nature11543.
- Jaillon O, French-Italian Public Consortium for Grapevine Genome Characterization et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449(7161):463–467. doi: 10.1038/nature06148.
- Jia J, Zhao S, Kong X, Li Y, Zhao G, He W, Appels R, Pfeifer M, Tao Y, Zhang X, Jing R, Zhang C, Ma Y, Gao L, Gao C, Spannagl M, Mayer KFX, Li D, Pan S, Zheng F, Hu Q, Xia X, Li J, Liang Q, Chen J, Wicker T, Gou C, Kuang H, He G, Luo Y, Keller B, Xia Q, Lu P, Wang J, Zou H, Zhang R, Gao J, Middleton C, Quan Z, Liu G, Wang J, IWGSC, Yang H, Liu X, He Z, Mao L, Wang J. The Aegilops tauschii draft genome sequence reveals a gene repertoire for wheat adaptation. Nature. 2013. doi: 10.1038/nature12028.
- Johnson LM, Bostick M, Zhang X, Kraft E, Henderson I, Callis J, Jacobsen SE. The SRA methyl-cytosine-binding domain links DNA and histone methylation. Curr Biol. 2007;17:379–384. doi: 10.1016/j.cub.2007.01.009.
- Kang J, Huang KC, Xu Z, Wang Y, Abecasis GR, Li Y. AbCD: arbitrary coverage design for sequencing-based genetic studies. Bioinformatics. 2013. doi: 10.1093/bioinformatics/btt041.
- Kobayashi S, Goto-Yamamoto N, Hirochika H. Retrotransposon-induced mutations in grape skin color. Science. 2004;304:982. doi: 10.1126/science.1095011.
- Kovach MJ, McCouch SR. Leveraging natural diversity: back through the bottleneck. Curr Opin Plant Biol. 2008;11:193–200. doi: 10.1016/j.pbi.2007.12.006.
- Krichevsky A, Zaltsman A, Lacroix B, Citovsky V. Involvement of KDM1C histone demethylase-OTLD1 otubain-like histone deubiquitinase complexes in plant gene repression. Proc Natl Acad Sci U S A. 2011;108:11157–11162. doi: 10.1073/pnas.1014030108.
- Larson G, Burger J. A population genetic view of animal domestication. Trends Genet. 2013. doi: 10.1016/j.tig.2013.01.003.
- Larson G, Karlsson EK, Perri A, Webster MT, Ho SYW, Peters J, Stahl PW, Piper PJ, Lingaas F, Fredholm M, Comstock KE, Modiano JF, Schelling C, Agoulnik AI, Leegwater PA, Dobney K, Vigne J-D, Vila C, Andersson L, Lindblad-Toh K. Rethinking dog domestication by integrating genetics, archeology, and biogeography. Proc Natl Acad Sci U S A. 2012;109:8878–8883. doi: 10.1073/pnas.1203005109.
- Larson G, et al. Patterns of East Asian pig domestication, migration, and turnover revealed by modern and ancient DNA. Proc Natl Acad Sci U S A. 2010;107:7686–7691.
- Li B, Chen W, Zhan X, Busonero F, Sanna S, Sidore C, Cucca F, Kang HM, Abecasis GR. A likelihood based framework for variant calling and de novo mutation detection in families. PLoS Genet. 2012;8:e1002944. doi: 10.1371/journal.pgen.1002944.
- Li M, Atmaca-Sonmez P, Othman M, Branham KE, Khanna R, Wade MS, Li Y, Liang L, Zareparsi S, Swaroop A, Abecasis GR. CFH haplotypes without the Y402H coding variant show strong association with susceptibility to age-related macular degeneration. Nat Genet. 2006;38:1049–1054. doi: 10.1038/ng1871.
- Ling H-Q, Zhao S, Liu D, Wang J, Sun H, Zhang C, Fan H, Li D, Dong L, Tao Y, Gao C, Wu H, Li Y, Cui Y, Guo X, Zheng S, Wang B, Yu K, Liang Q, Yang W, Lou X, Chen J, Feng M, Jian J, Zhang X, Luo G, Jiang Y, Liu J, Wang Z, Sha Y, Zhang B, Wu H, Tang D, Shen Q, Xue P, Zou S, Wang X, Liu X, Wang F, Yang Y, An X, Dong Z, Zhang K, Zhang X, Luo M-C, Dvorak J, Tong Y, Wang J, Yang H, Li Z, Wang D, Zhang A, Wang J. The draft genome of Triticum urartu, the progenitor of the wheat A genome. Nature. 2013. doi: 10.5524/100050.
- Martin DM, et al. Functional annotation, genome organization and phylogeny of the grapevine (Vitis vinifera) terpene synthase gene family based on genome assembly, FLcDNA cloning, and enzyme assays. BMC Plant Biol. 2010;10:226. doi: 10.1186/1471-2229-10-226.
- Middleton CP, Stein N, Keller B, Kilian B, Wicker T. Comparative analysis of genome composition in Triticeae reveals strong variation in transposable element dynamics and nucleotide diversity. Plant J. 2012. doi: 10.1111/tpj.12048.
- Morgante M, Brunner S, Pea G, Fengler K, Zuccolo A, Rafalski A. Gene duplication and exon shuffling by helitron-like transposons generate intraspecies diversity in maize. Nat Genet. 2005;37:997–1002. doi: 10.1038/ng1615.
- Morgante M, De Paoli E, Radovic S. Transposable elements and the plant pan-genomes. Curr Opin Plant Biol. 2007;10:149–155. doi: 10.1016/j.pbi.2007.02.001.
- Muñoz-Amatriaín M, Xiong Y, Schmitt MR, Bilgic H, Budde AD, Chao S, Smith KP, Muehlbauer GJ. Transcriptome analysis of a barley breeding program examines gene expression diversity and reveals target genes for malting quality improvement. BMC Genomics. 2010;11:653. doi: 10.1186/1471-2164-11-653.
- Nobrega MA, Zhu Y, Plajzer-Frick I, Afzal V, Rubin EM. Megabase deletions of gene deserts result in viable mice. Nature. 2004;431:988–993. doi: 10.1038/nature03022.
- Okagaki RJ, Cho S, Kruger WM, Xu WW, Heinen S, Muehlbauer GJ. The barley UNICULM2 gene resides in a centromeric region and may be associated with signaling and stress responses. Funct Integr Genomics. 2012. doi: 10.1007/s10142-012-0299-7.
- Olmo HP. Grapes: Vitis and Muscadinia. In: Simmonds NW, editor. Evolution of crop plants. London: Longman; 1976. pp. 294–298.
- Penterman J, Uzawa R, Fischer RL. Genetic interactions between DNA demethylation and methylation in Arabidopsis. Plant Physiol. 2007;145:1549–1557. doi: 10.1104/pp.107.107730.
- Raychaudhuri S, Iartchouk O, Chin K, Tan P, Tai A, Ripke S, Gowrisankar S, Vemuri S, Montgomery K, Yu Y, Reynolds R, Zack DJ, Campochiaro P, Katsanis N, Daly M, Seddon JM. A rare penetrant mutation in CFH confers high risk of age-related macular degeneration. Nat Genet. 2011;43:1232–1238. doi: 10.1038/ng.976.
- Scalabrin S, Troggio M, Moroldo M, Pindo M, Felice N, Coppola G, Prete G, Malacarne G, Marconi R, Faes G, Jurman I, Grando S, Jesse T, Segala C, Valle G, Policriti A, Fontana P, Morgante M, Velasco R. Physical mapping in highly heterozygous genomes: a physical contig map of the Pinot Noir grapevine cultivar. BMC Genomics. 2010;11:204. doi: 10.1186/1471-2164-11-204.
- Seo HC, Edvardsen RB, Maeland AD, Bjordal M, Jensen MF, Hanssen A, Flaat M, Weissenbach J, Lehrach H, Wincker P, Reinhardt R, Chourrout D. Hox cluster disintegration with persistent anteroposterior order of expression in Oikopleura dioica. Nature. 2004;431:67–71. doi: 10.1038/nature02709.
- Springer NM, Stupar RM. Allelic variation and heterosis in maize: how do two halves make more than a whole? Genome Res. 2007;17:264–275. doi: 10.1101/gr.5347007.
- Springer NM, Stupar RM. Allele-specific expression patterns reveal biases and embryo-specific parent-of-origin effects in hybrid maize. Plant Cell. 2007;19:2391–2402. doi: 10.1105/tpc.107.052258.
- Stach T, Winter J, Bouquet J-M, Chourrout D, Schnabel R. Embryology of a planktonic tunicate reveals traces of sessility. Proc Natl Acad Sci U S A. 2008;105:7229–7234. doi: 10.1073/pnas.0710196105.
- Stroud H, Greenberg MVC, Feng S, Bernatavichute YV, Jacobsen SE. Comprehensive analysis of silencing mutants reveals complex regulation of the Arabidopsis methylome. Cell. 2013;152:352–364. doi: 10.1016/j.cell.2012.10.054.
- Szűcs P, Blake VC, Bhat PR, Chao S, Close TJ, Cuesta-Marcos A, Muehlbauer GJ, Ramsay L, Waugh R, Hayes PM. An integrated resource for barley linkage map and malting quality QTL alignment. Plant Genome. 2009;2:134–140. doi: 10.3835/plantgenome2008.01.0005.
- Tettelin H, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome”. Proc Natl Acad Sci U S A. 2005;102:13950–13956. doi: 10.1073/pnas.0506758102.
- The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632.
- Tian Z, Yu Y, Lin F, Yu Y, Sanmiguel PJ, Wing RA, McCouch SR, Ma J, Jackson SA. Exceptional lability of a genomic complex in rice and its close relatives revealed by interspecific and intraspecific comparison and population analysis. BMC Genomics. 2011;12:142. doi: 10.1186/1471-2164-12-142.
- Trethowan RM, Mujeeb-Kazi A. Novel germplasm resources for improving environmental stress tolerance of hexaploid wheat. Crop Sci. 2008;48:1255–1265. doi: 10.2135/cropsci2007.08.0477.
- Vannozzi A, Dry IB, Fasoli M, Zenoni S, Lucchin M. Genome-wide analysis of the grapevine stilbene synthase multigenic family: genomic organization and expression profiles upon biotic and abiotic stresses. BMC Plant Biol. 2012;12:130. doi: 10.1186/1471-2229-12-130.
- Venkatraman E, Olshen A. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics. 2007;23:657–663. doi: 10.1093/bioinformatics/btl646.
- Venturini L, Ferrarini A, Zenoni S, Tornielli GB, Fasoli M, Dal Santo S, Minio A, Buson G, Tononi P, Zago ED, Zamperin G, Bellin D, Pezzotti M, Delledonne M. De novo transcriptome characterization of Vitis vinifera cv. Corvina unveils varietal diversity. BMC Genomics. 2013;14:41. doi: 10.1186/1471-2164-14-41.
- Waugh R, et al. Whole-genome association mapping in elite inbred crop varieties. Genome. 2010;53:967–972. doi: 10.1139/G10-078.
- Zeder MA. The domestication of animals. J Anthropol Res. 2012;68:161–190.
- Zhong X, Hale CJ, Law JA, Johnson LM, Feng S, Tu A, Jacobsen SE. DDR complex facilitates global association of RNA polymerase V to promoters and evolutionarily young transposons. Nat Struct Mol Biol. 2012;19:870–875. doi: 10.1038/nsmb.2354.