Author manuscript; available in PMC: 2016 Jul 2.
Published in final edited form as: Cell. 2015 Jul 2;162(1):23–32. doi: 10.1016/j.cell.2015.06.024

The Convergence of Systems and Reductionist Approaches in Complex Trait Analysis

Evan G Williams 1, Johan Auwerx 1,*
PMCID: PMC4493761  NIHMSID: NIHMS694936  PMID: 26140590

Abstract

Research into the genetic and environmental factors behind complex trait variation has traditionally been segregated into distinct scientific camps. The reductionist approach aims to decrypt phenotypic variability bit-by-bit, founded on the underlying hypothesis that genome-to-phenome relations are largely constructed from the additive effects of their molecular players. In contrast, the systems approach aims to examine large-scale interactions of many components simultaneously, on the premise that interactions in gene networks can be both linear and nonlinear. The two approaches are complementary, and they are becoming increasingly intertwined due to developments in gene editing tools, omics technologies, and population resources. Together, these strategies are beginning to drive the next era in complex trait research, paving the way towards improved agriculture and more personalized medicine.

Keywords: Genetic Reference Populations (GRPs), G×E, Complex Traits, Gene Mapping, Genetically Modified Foods, GWAS, Gain-or-Loss of Function (G/LOF), Personalized Medicine, Reverse Genetics, Forward Genetics, Systems Genetics, Reductionism, Cross-Model, High-Dimensional Biology (HDP)

BACKGROUND

Over the past century, great strides have been made towards identifying and understanding the genetic and environmental factors affecting complex traits. This progress has come from divergent though complementary genetic techniques that address similar biological hypotheses from different angles: the study of humans and crops vs. model organisms, forward vs. reverse genetics, holistic vs. additive models, and so forth. Results from each of these approaches include the identification of thousands of variants influencing complex traits (Kingsmore et al., 2007) and the mechanistic delineation of hundreds of molecular pathways (Kanehisa et al., 2014). This knowledge has led to numerous practical benefits, ranging from the development of vaccines for transmissible diseases, to drug treatments, dietary, and lifestyle changes for metabolic diseases, to the rational modification of crops and livestock for agricultural needs. However, our ability to predict when, where, and how genetic, environmental, or gene-by-environment interactions (G×E) will lead to specific phenotypic outcomes is still limited, and few rationally-generated cures or preventative strategies are available for common diseases such as cancer and diabetes. In large part, this shortcoming stems from the challenge of identifying key genetic and environmental regulators that can be targeted by drugs, lifestyle changes, or genetic modifications. For decades, the identification of new targets and the implementation of treatments has been hampered by the conceptual complexity and diversity of genetic mechanisms. This obstacle is now being overcome by a combination of improvements in technical capability, new approaches in scientific thought, and increased resource sharing. With this in mind, how can we use these developments to find new medical and biological breakthroughs? To answer this, we must first consider the current state of research into complex traits, and how it has evolved over time (Figure 1A).

Figure 1.


Broad summary of concepts and developments in genetic analysis. (A) Relative timeline of complex trait analysis, listing some key landmarks and their approximate dates. (B) Left. The forward genetics approach relies on identifying divergent phenotypes then searching for the causative genetic factors, generally through QTL analysis or GWAS. Right. The reverse genetics approach relies on modifying a gene (or genes) of interest and then scanning for impact on downstream traits, often by G/LOF modifications. (C) Single genes can fully regulate a phenotype (e.g. sickle-cell trait), or they can play a role in more complex phenotypes for part of the population (e.g. PPARγ and metabolic disease (Deeb et al., 1998)). However, complex trait variation also comes from epistasis (e.g. mutations in p53 and other genes together are necessary to develop cancer (Soussi et al., 1994)). To fully explain heritability and complex trait variation for an entire population (e.g. diabetes), it is likely we will need the ability to model and test arbitrarily-complex interaction of dozens or more variants simultaneously. In this cartoon, examples are given where one, two, or many genes must be modified in order to change the observed phenotype (color—monogenic, spots—oligogenic, or size—polygenic).

DEVELOPMENTS IN COMPLEX TRAIT ANALYSIS

Traditionally, studies on complex traits could be separated into two distinct categories: those using forward or reverse genetics (Figure 1B). Both seek to answer fundamental and longstanding questions on how genes, the environment, and G×E factors influence complex traits. In forward genetics, a variable phenotype is measured and the upstream causal genetic variants are identified, while reverse genetics starts at the gene level and searches for the downstream phenotypic impact. Pioneering genetics studies relied on the forward genetics paradigm: natural populations and later, randomly mutagenized stocks were screened for variant phenotypes, then successively backcrossed to identify the causal locus. In the 1970s, research platforms began to shift towards reverse genetics as the first targeted mutagenesis techniques were developed, and by the 1990s this concept had come to the forefront in the study of complex traits. This capability of performing gain-or-loss of function (G/LOF) studies on target genes allowed the rational and mechanistic examination of genetic hypotheses gene-by-gene, and potentially even the reverse-engineering of complex traits. Consequently, many researchers began to advocate for—and generate—comprehensive genetic libraries of G/LOF tools in a variety of organisms, including yeast (Winzeler et al., 1999), Arabidopsis (Alonso et al., 2003), Drosophila (Ryder et al., 2007), C. elegans (Kamath et al., 2003; Rual et al., 2004), and mice (Auwerx et al., 2004; Skarnes et al., 2011). Unfortunately, the reality of using these resources to efficiently and comprehensively identify novel variants behind complex traits has been undermined by two major factors. First, we now know that much natural trait variation is driven by both the additive and non-additive interaction of dozens or more variants (Bogardus et al., 2002; Clark, 2000) (Figure 1C). 
Second, major genetic alterations, such as those typically induced in G/LOF studies, are a poor model for the common variants influencing trait variation in natural populations, which are generally more subtle (Chakravarti et al., 2013; MacArthur et al., 2012). Minor variants, gene × gene, and G×E interactions can be examined mechanistically using modern G/LOF tools, yet the exponential increase in the number of such possibilities as complexity expands necessitates the use of prior hypotheses instead of unbiased screens, particularly for vertebrate research. Finally, mechanisms that are uncovered in G/LOF models may not necessarily be generalizable to natural populations, whether in humans or in agriculture. These limitations of G/LOF models were recognized from the outset (Capecchi, 2005), but potential alternatives, particularly population genetics, suffered from strong deficits as well.
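The growth in the number of candidate interactions mentioned above can be made concrete with a short calculation. The following is a minimal sketch, assuming an illustrative round figure of 20,000 genes (this number is an assumption for the example, not a value taken from the text); the function name `interaction_counts` is likewise hypothetical.

```python
from math import comb


def interaction_counts(n_genes: int, max_order: int) -> dict[int, int]:
    """Count the distinct gene sets of each size from 1 up to max_order."""
    return {k: comb(n_genes, k) for k in range(1, max_order + 1)}


if __name__ == "__main__":
    # 20,000 is an assumed, illustrative gene count.
    for order, count in interaction_counts(20_000, 3).items():
        print(f"{order}-gene combinations: {count:,}")
```

Even at the level of three-gene combinations, the search space is far beyond what an unbiased vertebrate screen could cover, which is why prior hypotheses become necessary as the order of interaction rises.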

In parallel to the developments in forward and reverse genetics techniques, progress continued steadily on molecular measurement technologies that expanded the scope and depth of genetic analysis. The début of what has become the “omics revolution” began with massive investments in large-scale nucleotide sequencing (Smith and Hood, 1987), biological applications of mass spectrometry (Fenn et al., 1989; Wasinger et al., 1995), and array technology (Schena et al., 1995). By the late 1990s, the genomic and transcriptomic tools were sufficiently refined and affordable that small collaborative groups had the capability to generate and test hypotheses that required full pathway analysis by using comprehensive genomic and transcriptomic datasets. While the resulting and unprecedentedly-thorough datasets aided both population and G/LOF research, they particularly boosted the population approach. In theory, omics coverage could provide the capacity to identify causal gene networks wholesale through data-driven approaches—even directly in humans. Indeed, initial results using this approach to study common complex disorders were promising, as exemplified by the identification of variants in two genes, PPARγ (Deeb et al., 1998) and MC4R (Yeo et al., 1998), causal for metabolic disease. However, human population studies examining such genome–to–phenome links (e.g. genome-wide association studies (GWAS)) ran into several major barriers, among them the issues of linkage disequilibrium, commonly detected SNPs having small effect sizes, poor long-term environmental control, and the perennial issue of “missing heritability” (Goldstein, 2009; Lander, 2011). Furthermore, while genotype information is fairly consistent across time and tissue, the ephemeral nature of transcripts, proteins, and metabolites hindered detailed mechanistic analyses in human populations due to the difficulty or impossibility—depending on tissue—of obtaining biopsies. 
In retrospect, we have now seen that human GWAS led only to a slow trickle of discoveries linking novel gene variants to complex traits (McCarthy et al., 2008).

In principle, some of these issues could be bypassed by analyzing diverse populations of model organisms, the generation and application of which are relatively similar across species (Flint and Mackay, 2009) (Figure 2A). Such populations fall broadly into two groups: those of genetically unique individuals such as F2s and outbreds, and those in specific and reproducible genetic reference populations (GRPs). However, the implementation of these concepts during the early years of high-throughput sequencing and microarray transcriptomics was hindered by small cohort sizes and the limited selection of ready-made GRPs (Williams et al., 2001). The earliest GRPs, such as the BXD mice (Taylor et al., 1973), had been developed and utilized decades previously, primarily for forward genetics analyses. However, such early studies often linked phenotypic variants to broad quantitative trait loci (QTLs) containing dozens to hundreds of candidate genes, and thus the specific causative genes were rarely identified (Flint et al., 2005). This shortcoming could be largely solved by analyzing larger populations and by improving recombination density (Darvasi and Soller, 1995), yet GRPs are expensive to develop and can take many years to generate, particularly in vertebrates. Thus, little development was done on expanding model organism populations, even in invertebrates, until technological advances in omics and the shortcomings (and benefits) of human population research had been well recognized. Since then, a diverse collection of GRPs has been developed across many model organisms, which can simulate many aspects of the genetic complexity of natural populations (human, animal, or plant) but in controlled settings (Churchill et al., 2004; Kover et al., 2009; Mackay et al., 2012) (Table 1).

Figure 2.


(A) A summary and basic breeding schema for major population types. F2s are generated ad hoc, which can be studied or crossed and inbred for 20+ generations to generate recombinant inbred lines (RIL). F2s can be backcrossed to generate congenic, consomic, or conplastic strains. (B) Heritability should be considered as a function of gene–environment interactions, especially for complex traits influenced by environmental factors. Top Left: Across a diverse GRP, the observed trait is significantly elevated in “Environment B”. Bottom Left: Despite the difference in trait expression across environments, the heritability is similar when groups are segregated, though it plummets when the groups are combined. If the environmental difference can be controlled and groups separated, as here, the environment and G×E factors can be calculated by two-way ANOVA. Right: When the same data are displayed as strain averages, the heritability drop can be visualized. Within either environment, variability due to genotype (y-axis) is far higher than the error within a genotype (error bars). When these data are compressed without respect to environment, the cross-genotype variance compared to within-genotype variance decreases, hence the drop in observed heritability. (C) A broad overview of relative species particularities and strengths for the most common model organisms in population genetics. The suitability of any model also depends substantially on the question to be addressed and specific experimental design.
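The heritability effect described for panel B can be reproduced with a small simulation. The sketch below is illustrative only: all effect sizes, the strain count, and the function names are assumptions, and the share of the total sum of squares explained by strain (eta-squared) is used as a crude stand-in for broad-sense heritability rather than the two-way ANOVA mentioned in the caption.

```python
import random
from statistics import mean


def variance_share_by_strain(records):
    """Fraction of the total sum of squares explained by strain (eta-squared).

    `records` is a list of (strain, value) pairs. This is a rough proxy for
    broad-sense heritability, not a full variance-components estimate.
    """
    grand = mean(value for _, value in records)
    by_strain = {}
    for strain, value in records:
        by_strain.setdefault(strain, []).append(value)
    ss_total = sum((value - grand) ** 2 for _, value in records)
    ss_between = sum(
        len(values) * (mean(values) - grand) ** 2 for values in by_strain.values()
    )
    return ss_between / ss_total


def simulate(n_strains=30, n_reps=5, env_shift=4.0, seed=1):
    """Simulate one trait in two environments for a hypothetical GRP."""
    rng = random.Random(seed)
    env_a, env_b = [], []
    for strain in range(n_strains):
        g = rng.gauss(0.0, 1.0)    # strain (genotype) effect, shared by both environments
        gxe = rng.gauss(0.0, 0.5)  # strain-specific response to environment B (GxE)
        for _ in range(n_reps):
            env_a.append((strain, 10.0 + g + rng.gauss(0.0, 0.5)))
            env_b.append((strain, 10.0 + env_shift + g + gxe + rng.gauss(0.0, 0.5)))
    return env_a, env_b


env_a, env_b = simulate()
h2_a = variance_share_by_strain(env_a)
h2_b = variance_share_by_strain(env_b)
h2_pooled = variance_share_by_strain(env_a + env_b)  # environment ignored
print(f"A: {h2_a:.2f}  B: {h2_b:.2f}  pooled: {h2_pooled:.2f}")
```

Within each environment, strain explains most of the variance. Once the two environments are pooled, the environmental shift is counted as within-strain noise and the apparent heritability drops sharply.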

Table 1.

A list of landmark dates and papers in the generation of model populations, along with key populations or first populations in a variety of species. While the technical ability to generate most of these systems has existed for decades, it is only within the last ~15 years that they have gained prominence, due primarily to developments in omics technologies. All dates are approximate, given that it can take many years between initial population conceptualization and their first publication.

| Date | Resource | Organism | Reference |
| | Wild Type | (All) | |
| ~1866 | F2 Intercross | Pea | Mendel |
| ~1909 | Inbred Lines | Mouse | CC Little |
| 1959 | Recombinant Inbred (CXB) | Mouse | (Bailey, 1981) |
| 1971 | Recombinant Inbred (BXD) | Mouse | (Taylor et al., 1973) |
| 1982 | Recombinant Inbred (BXH/HXB) | Rat | (Pravenec et al., 1989) |
| 1984 | Diversity Panel (N/Nih) | Rat | (Li and Lumeng, 1984) |
| 1988 | Recombinant Inbred (WSxM13) | Corn | (Burr et al., 1988) |
| 1996 | Recombinant Inbred (WSxM13) | Arabidopsis | (Liu et al., 1996) |
| 1997 | Recombinant Inbred (Rx2b) | Drosophila | (Nuzhdin et al., 1997) |
| 2000 | Chromosome Substitution | Mouse | (Nadeau et al., 2000) |
| 2001 | Recombinant Inbred (BOxRC301) | Caenorhabditis | (Ayyadevara et al., 2001) |
| 2002 | Recombinant Inbred | Soybean | (Yuan et al., 2002) |
| 2003 | Recombinant Inbred | Barley | (Arru et al., 2003) |
| 2004 | Advanced Recombinant Inbred (BXD) | Mouse | (Peirce et al., 2004) |
| 2004 | Advanced Recombinant Inbred (LXS) | Mouse | (Williams et al., 2004) |
| 2004 | Advanced Recombinant Inbred (Collaborative Cross) | Mouse | (Churchill et al., 2004) |
| 2005 | Diversity Panel (Heterogeneous stock) | Caenorhabditis | (Sivasundar and Hey, 2005) |
| 2006 | Diversity Panel (Heterogeneous stock) | Mouse | (Valdar et al., 2006) |
| 2008 | Diversity Panel (1001 Genomes Project) | Arabidopsis | (Ossowski et al., 2008) |
| 2009 | Advanced Recombinant Inbred (MAGIC) | Arabidopsis | (Kover et al., 2009) |
| 2009 | Advanced Recombinant Inbred (N2xCB4856) | Caenorhabditis | (Rockman and Kruglyak, 2009) |
| 2010 | Chromosome Substitution | Rice | (Xu et al., 2010) |
| 2010 | Diversity Panel (Hybrid Mouse Diversity Panel) | Mouse | (Bennett et al., 2010) |
| 2011 | Recombinant Inbred (4-way cross) | Yeast | (Cubillos et al., 2011) |
| 2012 | Diversity Panel (DGRP) | Drosophila | (Mackay et al., 2012) |
| 2012 | Advanced Recombinant Inbred (DSPR) | Drosophila | (King et al., 2012) |

In addition to solving some of the key issues of human cohort studies, including sample collection and long-term environmental control, model populations can also be used to address longstanding genetics questions. For instance, complex traits that are strongly influenced by G×E factors may lead to incorrect calculations stating that these traits are not highly heritable, when in fact the heritability may be influenced by genes modulated by variable, but unknown, environmental factors (Figure 2B). When selecting which model organism (or model organisms) to examine, many tradeoffs must also be considered, including (1) speed, expense, efficiency, and technical concerns such as recombination events and self-fertilization or inbreeding depression, (2) ability to perform tissue and time-specific molecular phenotyping, (3) availability of gene editing tools/libraries, (4) availability of population resources, (5) similarity to humans, agriculture, or livestock, and (6) the resource community surrounding the model (Figure 2C). Additionally, certain model organisms can provide unique capacity for study designs that would be infeasible or even impossible in other species. For example, one recent study exploited genetically-diverse yeast to create sets of millions of individuals which in turn display gene expression from very low (akin to “knockouts”), to average (akin to “wildtype”), to very high (akin to “transgenic overexpression”), and everywhere in between (Albert et al., 2014). In such extensive populations, nearly all genes display very strong levels of variability. Moreover, the power to detect QTLs depends largely on the variance of the traits of interest. Thus, this approach (dubbed extreme QTL, or “X-QTL”, mapping) increases both the scale and scope of findings compared with using smaller cohorts. However, two factors limit the generation of X-QTLs in more complex organisms. 
The first is due to scale, as analyzing millions of individuals remains logistically and economically prohibitive in most organisms, and the second due to technique, as the expression of any gene can be examined visually in yeast (and C. elegans) by tagging target genes with green fluorescent protein (GFP). As such, gene expression may be assessed across millions of individuals visually, groups may be separated into a few clusters based on expression, and omics analysis can be performed only on these subsets. For more complex organisms, such as mice, omics analyses would be necessary for all millions of individuals to capture the full spectrum of variance.

Model populations can also address many of the shortcomings of studies using G/LOF models or single inbred lines. For instance, early drug trials in model organisms are predominantly performed as experiments with a genotypic n = 1. The reality of complex trait genetics, however, means that even a reproducible finding in one individual may not apply in another. For example, had researchers exclusively used the DBA/2J mouse strain in place of the prototypical C57BL/6J, they would have concluded that morphine is a non-addictive and inefficient painkiller (Elmer et al., 2010). Examples like this have led to a recognition that genetic diversity must be considered at all steps from target identification to final outcome, an approach that is beginning to be implemented medically (Barretina et al., 2012), and which is starting to assist in the design of compounds which are highly effective in particular, defined subsets of patients (e.g. CFTR variant-specific drug treatments for cystic fibrosis (Ledford, 2012)). However, simplified models remain essential: early drug screens of dozens of novel compounds to treat an illness cannot be efficiently performed if they are each thoroughly tested in extensive GRPs. These tradeoffs between comprehensive genetic models (which may be excessively complex) and simplified models (which may be oversimplified) must be considered at each stage of the research process. This realization is beginning to lead to the complementary implementation of these approaches throughout the pipeline of complex trait analysis.

COMPLEMENTARY APPROACHES IN COMPLEX TRAIT GENETICS

The parallel expansion of gene editing capabilities, omics technologies, and population resources has created the capability to implement experiments incorporating mechanistic in vitro experiments, comprehensive studies in G/LOF organisms and model populations, and research in natural populations. This complementarity is multidirectional. In one direction, in vitro mechanistic studies can provide detailed hypotheses for in vivo verification using reductionist models, which can then be tested in diverse genetic backgrounds for potential interaction effects of the variant, and finally examined outside the laboratory setting. In the other direction, population studies, often beset by false discovery, may use G/LOF models to distinguish among several potential causal factors or to obtain detailed mechanistic understanding. In general, study design can also be streamlined by considering complementary cross-species techniques. For instance, the de novo identification of gene–phenotype links can be readily established using unbiased screens in invertebrates, thus providing specific hypotheses for G/LOF work in mammals, and eventually for the clinic (Figure 3A). Population studies may likewise benefit from cross-species analysis (from GRP to GWAS or vice versa), such as in assisting the identification of causal genes under a QTL or within linkage disequilibrium (Figure 3B). Finally, it is not fundamentally necessary for the target genes, or even mechanisms, to be conserved across species to derive utility from a cross-species or cross-model approach. Discovery of unique aspects of an organism can be as informative as the commonalities, such as the key differences in atherosclerosis development between mice and humans (Jiang et al., 1992). Furthermore, particularly for agriculture, the identification of species-specific genes and mechanisms can also assist in the transgenic development of crops that are resistant to particular diseases or environmental conditions (Figure 3C).

Figure 3.


A summary of a few study designs using cross-model and/or cross-species approaches. (A) Conceptual schematic of G/LOF screenings performed in simpler model organisms (e.g. Drosophila) which may be used to generate targeted hypotheses for G/LOF studies in more complex (e.g. mammalian) models. These results may then provide targeted hypotheses for validation in human GWAS. (B) Cross-model example of population studies benefitting from one-another. QTL results do not indicate a specific gene, and equivalent GWAS results, displayed as a Manhattan plot, would not approach significance on a genome-wide scan. Furthermore, candidate SNPs in GWAS may be in linkage disequilibrium. The complementarity of independent population studies can often be applied across species by comparing syntenic regions. (C) Conceptual approach of gene transfer. Natural genetic pathways allowing certain plants to resist (e.g.) drought or disease may potentially be translated into other species via transgenesis.

The power of these combined approaches to complex trait analysis can be illustrated by a few recent studies. For instance, population research in any organism can be used to identify gene–phenotype relationships en masse. However, while only a few gene candidates may be practically assessed in vertebrate G/LOF studies, hundreds of genes can be analyzed in yeast, worms, or flies (related to Figure 3A). In one specific recent example, a study screened and phenotyped 11,594 transgenic lines of Drosophila for adiposity-related phenotypes (Pospisilik et al., 2010). For the ~500 positive hits, the authors performed mechanistic analyses to define the links between each candidate and adiposity, and to inform upon the strongest candidates for G/LOF experiments in mice. Subsequently, fat-specific knockout mice were generated for the top candidate gene Sufu, which displayed robust changes in fat mass. Such large screens can also be used to inform directly upon human GWAS, as shown in a separate recent example comparing results from a genetic loss of function screen in Drosophila to rare diseases in human exome data (Yamamoto et al., 2014). The complementarity of these cross-species exchanges can be quite surprising: for instance, genes linked to neurological diseases in humans can be effectively analyzed in reductive plant models (Xu and Møller, 2011).

In other cases, both hypothesis generation and validation may be performed using population studies, including for humans and mice (related to Figure 3B). For instance, a recent extensive metabolic study of the BXD mouse GRP identified more than a dozen novel and significant phenotype QTLs (Andreux et al., 2012). The QTL for systolic blood pressure, with five candidate genes, was prioritized for validation due to the small number of possibilities and, moreover, the ready availability of independent population studies which measured blood pressure (Koutnikova et al., 2009). In the subsequent analysis of three independent human GWAS, SNPs in a single gene, Ubp1, were associated consistently and significantly with elevated blood pressure. Although these SNPs were nominally significant in the original data, the correction for multiple testing across hundreds of thousands of SNPs across the whole genome drowned out the signal. Details about cellular mechanisms can also be confidently attributed solely using population data. In a separate BXD study, it was observed that diabetic mice have significantly lower levels of the metabolite 2-aminoadipate, independently from the effects of low or high fat diet feeding (Wu et al., 2014). In turn, the levels of this metabolite mapped significantly to the gene locus containing Dhtkd1, a known enzyme in 2-aminoadipate metabolism (Danhauser et al., 2012). Thus, a mechanistic link was identified between Dhtkd1 and diabetes via the regulation of 2-aminoadipate. Furthermore, two independent populations (one mouse, one human) were examined using this hypothesis, and again the same links with diabetes were observed. As before, multiple testing correction had prevented the identification of this connection in the original datasets (Wu et al., 2014).
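The effect of multiple testing correction described in the two examples above can be illustrated with simple arithmetic. The sketch below assumes a Bonferroni correction, which the text does not specify, and every number in it is hypothetical and chosen only to show how restricting the hypothesis space rescues a nominally significant signal.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# All numbers below are hypothetical, chosen only to illustrate the effect.
snp_p_value = 2e-5           # a nominally significant SNP in a candidate gene
genome_wide_tests = 500_000  # SNPs tested in an unbiased genome-wide scan
candidate_tests = 50         # SNPs within the few genes under a mouse QTL

genome_wide_cutoff = bonferroni_threshold(0.05, genome_wide_tests)  # about 1e-7
candidate_cutoff = bonferroni_threshold(0.05, candidate_tests)      # about 1e-3

print("passes genome-wide scan:", snp_p_value < genome_wide_cutoff)
print("passes candidate-gene test:", snp_p_value < candidate_cutoff)
```

The same p-value fails the genome-wide cutoff but passes once an independent population study has narrowed the search to a handful of candidate genes.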

Finally, human medical application is not the only goal: diseases uniquely affecting animals (e.g. rinderpest, foot-and-mouth disease) and plants (e.g. coffee rust, black rot) are equally important. Additionally, genetic engineering in agriculture in response to environmental needs or changes (e.g. drought resistance, improved yield) is a critical research goal as we must contend with expanding populations and general climatic variation. For instance, gene–phenotype links identified in Arabidopsis, which is itself not a crop, are often applied to agriculture. Mutagenesis screens in Arabidopsis have identified genetic variants influencing flowering promotion (e.g. SFT) and flowering repression (e.g. SP), which can be subsequently applied cross-species (related to Figure 3C). In tomato, the orthologs of these genes were rationally modified, leading to a striking 130% variation in tomato yield between the suppressed and promoted plants (Park et al., 2014). Similar cross-species plant studies have improved cold tolerance in tobacco (Zhao et al., 2009), drought resistance in rice (Datta et al., 2012), and salt resistance in barley (Schilling et al., 2014). Research in agricultural genetics often runs in parallel to medical genetics, yet this separation is not fundamental, and the two can benefit equally from sharing technologies, methodologies, and findings.

BIG DATA

Moving across species, and between or among populations and G/LOF models, can provide a greater understanding of biological mechanisms underlying complex traits. However, the in vivo application of such resources still requires extensive experience and, potentially, considerable time and expense. To this end, it is essential to consider and exploit the wealth of results available in public “big data” resources, which can allow hypothesis examination in silico. Over the last ten years, the generation of full genomic and transcriptomic datasets has become commonplace. The full DNA sequences of more than 100 species are available on the UCSC Genome Browser (Kent et al., 2002), more than 1.2 million complete transcript datasets are hosted on the Gene Expression Omnibus (GEO) (Barrett et al., 2013), and dozens of smaller repositories have sprung up as well, often organized per-species or per-topic (Bastian et al., 2008; Chesler et al., 2004; Harris et al., 2010). Conjointly, a wide array of software and web resources has been developed to aid in the analysis and interpretation of large systems datasets (Barrett et al., 2013; Subramanian et al., 2005; Wang et al., 2003). These resources can provide secondary study validation, study power, and founding hypotheses, and can inform decisions on study design. These possibilities are rapidly expanding, as studies increasingly generate more data than can be analyzed under the scope of a single program. When made public, this “excess” information adds to a treasure trove for secondary analysis and independent validation. Consequently, primary research projects are increasingly taking advantage of independent research beyond the introduction and verbatim citations, whether for hypothesis generation, validation, or complete meta-analysis (Khan et al., 2014; Shin et al., 2014).

Historical data can also grow in value over time as additional, diverse datasets are collated. For example, early QTL studies often identified strong gene–phenotype links, but were unable to establish the causal genes. These “unfinished stories” now serve as pre-made hypotheses, such as for detailed longevity data recorded in the BXD GRP well before the common implementation of omics technologies (De Haan and Van Zant, 1999; Gelman et al., 1988). These two early studies observed a common, significant QTL on chromosome 2 that contributed about 30% to the overall variation in lifespan, but the causal gene(s) were not defined, and the QTL went unproven. More than a decade later, using modern genetic maps, transcriptomic data, and information on sequence variants, specific candidate genes were found under the QTL. Subsequently, two of these genes (mrps-5 and nkcc-1) were validated in C. elegans using G/LOF technologies that, again, did not exist at the time of the initial study—even if candidate genes had been identified (Houtkooper et al., 2013). Further mechanistic studies on mrps-5 in C. elegans and in an independent set of BXDs were used to uncover that the longevity effects stemmed from the dysregulation of mitochondrial protein translation. The resulting stoichiometric imbalance between mitochondrial and nuclear encoded proteins, dubbed mitonuclear imbalance, induces the mitochondrial unfolded protein response (known as UPRmt), a known adaptive stress response protecting against aging (Jovaisaite et al., 2014). In other cases, rather than taking old datasets piecemeal, results may also be taken together and collated. Decades of detailed electronic medical records at hospitals now serve as the backbone for a recent approach called phenome wide association studies (PheWAS)—essentially, the reverse genetics analog of GWAS (Denny et al., 2013). 
In GWAS, a phenotype is queried against genotypes to determine the causative gene(s), while in PheWAS, a gene or SNP of interest is queried against phenotypes to determine associated traits. This is conceptually straightforward, yet to be effective this approach requires extensive and comprehensive sets of phenotype data—a far more expensive and time-consuming task to generate than for genotype data. Some successes have already been reported with this approach, e.g. FTO—traditionally an obesity gene—was recently linked to fibrocystic breast disease by PheWAS (Cronin et al., 2014).
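The mirrored query directions of GWAS and PheWAS can be sketched on a toy dataset. Everything below is hypothetical: the six-individual cohort, the SNP and trait names, and the use of a plain Pearson correlation in place of the regression-based association tests that real studies would use.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy cohort: allele counts (0, 1, 2) for six individuals.
genotypes = {
    "snp_1": [0, 0, 1, 1, 2, 2],
    "snp_2": [2, 0, 1, 0, 1, 2],
    "snp_3": [1, 2, 0, 2, 0, 1],
}
# Hypothetical phenotype measurements for the same six individuals.
phenotypes = {
    "body_weight": [20.1, 20.3, 24.8, 25.2, 30.0, 29.7],
    "blood_pressure": [118, 101, 109, 100, 111, 120],
    "lifespan": [700, 810, 690, 720, 805, 698],
}


def gwas(trait):
    """One phenotype against every SNP, ranked by absolute correlation."""
    scores = {snp: pearson(g, phenotypes[trait]) for snp, g in genotypes.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def phewas(snp):
    """One SNP against every phenotype, ranked by absolute correlation."""
    scores = {t: pearson(genotypes[snp], v) for t, v in phenotypes.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


print("GWAS top hit for body_weight:", gwas("body_weight")[0][0])
print("PheWAS top trait for snp_2:", phewas("snp_2")[0][0])
```

The two functions differ only in which axis is held fixed, which is why the practical bottleneck for PheWAS is the breadth of available phenotype data and not the analysis itself.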

The increasing richness of omics data from both populations and G/LOF models also allows meta-analytical studies to detect and validate novel findings completely in silico (Horvath, 2013; Lee et al., 2012). In one recent high-profile example of meta-analysis, thousands of human methylation arrays spanning 51 tissues were cross-referenced with a single basic phenotype: the age of the patient (Horvath, 2013). A statistical model was then developed to accurately calculate the ages of the patients using only tissue samples and a set of DNA methylation sites. Using this model, it was observed that cancers have “older” methylation patterns than healthy cells; for instance, calculating the age of a 30-year-old breast cancer patient using a blood sample would yield an age of 27–33 years, while the same calculation using a breast tumor biopsy would yield an age of ~75 years. Confidence in the study results was made possible by the wealth of independently generated methylation datasets: half of the studies were used to generate hypotheses, and the other half to test them, thus providing experimental validation without a need for “new” data. The crossover and “re-use” of data is not new, yet the viability of this approach has rapidly improved due to the remarkable increase in the quality and scope of omics datasets. However, the informatic difficulty of manipulating datasets at this scale has thus far largely restricted its application to computational biology (Marx, 2013). Correspondingly, there has been a disconnect between bioinformatics groups working in silico and generating non-validated networks of profound complexity, and traditional in vivo and in vitro biologists elucidating biochemical processes that may be applicable only in the specific circumstances of a single experiment. As with the converging applications of population genetics and G/LOF models, computational and wet lab biologists are now beginning to find common ground; bioinformatics platforms are becoming more widely established, both at the institutional level and embedded in individual wet labs.
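The discovery/validation logic of such a meta-analysis can be sketched on synthetic data. The sketch does not reproduce the statistical model of the cited study: it swaps in a deliberately crude predictor (one least-squares line per selected site, averaged), splits individual samples where the study split whole datasets, and every function name and simulation parameter is an assumption made for illustration.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def simulate_samples(n_samples, n_sites, n_age_sites, rng):
    """Synthetic (age, methylation values) pairs; the first n_age_sites drift with age."""
    samples = []
    for _ in range(n_samples):
        age = rng.uniform(20, 80)
        betas = [
            0.2 + 0.005 * age + rng.gauss(0, 0.02) if j < n_age_sites else rng.uniform(0, 1)
            for j in range(n_sites)
        ]
        samples.append((age, betas))
    return samples


def train_clock(discovery, n_keep):
    """Hypothesis generation: keep the sites most correlated with age, fit one line each."""
    ages = [age for age, _ in discovery]
    n_sites = len(discovery[0][1])
    columns = [[betas[j] for _, betas in discovery] for j in range(n_sites)]
    ranked = sorted(range(n_sites), key=lambda j: abs(pearson_r(columns[j], ages)), reverse=True)
    return {j: fit_line(columns[j], ages) for j in ranked[:n_keep]}


def predict_age(clock, betas):
    """Average the per-site age estimates."""
    return statistics.fmean(a + b * betas[j] for j, (a, b) in clock.items())


def median_abs_error(clock, samples):
    """Hypothesis testing: error on samples never seen during training."""
    return statistics.median(abs(predict_age(clock, betas) - age) for age, betas in samples)


rng = random.Random(1)
samples = simulate_samples(200, 30, 3, rng)
discovery, validation = samples[:100], samples[100:]
clock = train_clock(discovery, n_keep=3)
print("selected sites:", sorted(clock))
print("validation median error (years):", round(median_abs_error(clock, validation), 1))
```

The point of the design is that site selection and model fitting never see the validation half, so the reported error is an honest estimate obtained from data that already existed.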

FUTURE PERSPECTIVES

The recent fundamental shifts we have discussed are beginning to drive the next era in complex trait analysis, yet this is only the next of many steps in our push to understand genetics. We have mastered the ability to model and analyze monogenic gene-to-phenotype interactions, both through their identification by forward genetics and by using precise reverse genetics modifications to elucidate their mechanisms. Now, we are developing the capabilities to comprehensively identify and understand more expansive oligogenic gene-to-phenotype and G×E-to-phenotype interactions (see Figure 1C). Despite these advances, a complete holistic understanding of systems biology still faces a number of challenges: (1) there remains a need for higher quality and more dynamic omics measurements; (2) epistatic gene modeling requires exponentially increasing sample sizes; (3) even the best statistical assurances of causality do not preclude a necessity for experimental validation, at least so far; and (4) there is no “ideal” experimental setup for the study of complex traits. For the first challenge, the next stage is already in sight, with single cell dynamic profiling (Cai et al., 2006), quantitative, unbiased metabolite analysis (Fuhrer et al., 2011), larger-scale untargeted SWATH proteomics (Rost et al., 2014), and other such approaches breaking new ground. To the second point, model populations are becoming larger and more refined (Churchill et al., 2004) and G/LOF tools are becoming increasingly multiplexed (Chen et al., 2015; Sander and Joung, 2014), together allowing identification and validation of moderate-scale epistatic interactions. The third obstacle may perhaps be addressed in time, but there remains little indication that accurate mathematical models are on the horizon, except for a few specific biomolecular processes such as protein folding. The last point is more fundamental: a new experiment will always be necessary for testing a new drug or treatment, regardless of the improvements in predictive models or genomics resources.
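The scaling behind the second challenge can be made concrete with a little arithmetic. The sketch below rests on simplifying assumptions that are not taken from the text: each locus has three genotype states (two in fully inbred lines), all joint genotype classes are equally common, and the genome-wide count of 20,000 genes is a hypothetical round number.

```python
import math


def genotype_classes(n_loci, states_per_locus=3):
    """Number of joint genotype classes across n_loci loci."""
    return states_per_locus ** n_loci


def samples_needed(n_loci, per_class=10, states_per_locus=3):
    """Cohort size giving per_class individuals per class if classes were equally common."""
    return per_class * genotype_classes(n_loci, states_per_locus)


def candidate_sets(n_genes, n_loci):
    """Number of distinct locus combinations that could be tested."""
    return math.comb(n_genes, n_loci)


# Columns: interacting loci, genotype classes, cohort size, candidate combinations.
for k in range(1, 6):
    print(k, genotype_classes(k), samples_needed(k), candidate_sets(20000, k))
```

Both quantities grow with the number of interacting loci: the cohort needed to populate every genotype class grows exponentially, and the number of candidate combinations to test grows combinatorially, which is why larger populations and multiplexed G/LOF tools currently reach only moderate-scale interactions.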

Finally, further changes in mindset are required. As hypotheses become larger and more complex, data and experimental methods must be more readily communicated within and outside of the standard publication cycle. In academic research, this need is now widely accepted, yet a great deal of data and methods are still not made available with publications, whether due to privacy concerns (e.g. human data), the lack of a standard database (e.g. raw mass spectrometry data), or even active obfuscation (e.g. in patents, or in agricultural (Zamir, 2014) or pharmaceutical (Prayle et al., 2012) research). However, given the tremendous progress in resource sharing within the last decade, we expect many of these concerns to diminish in the coming years. A revolution in complex trait analysis is well underway, and the combined applications of systems biology and reductionist approaches are beginning to turn the trickle of genetic discoveries into a steady stream that will usher in a new era for personalized medicine and for environmentally-safe genetically modified crops.

Acknowledgments

We thank Dr. Xu Wang and Dr. Robert W. Williams for discussions on systems biology and cross-species approaches. JA is the Nestlé Chair in Energy Metabolism and the J.A. laboratory is supported by grants from the École Polytechnique Fédérale de Lausanne, the Swiss National Science Foundation (31003A-140780), the Sinergia grant mechanism (CSRII3-136201), the AgingX program of the Swiss Initiative for Systems Biology (51RTP0-151019), and the NIH (R01AG043930).

Footnotes

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the authors.

AUTHOR CONTRIBUTIONS

EW and JA outlined and designed the manuscript. EW drafted the manuscript and prepared figures. JA approved and edited the manuscript.

References

  1. Albert FW, Treusch S, Shockley AH, Bloom JS, Kruglyak L. Genetics of single-cell protein abundance variation in large yeast populations. Nature. 2014 doi: 10.1038/nature12904.
  2. Alonso JM, Stepanova AN, Leisse TJ, Kim CJ, Chen H, Shinn P, Stevenson DK, Zimmerman J, Barajas P, Cheuk R, et al. Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science. 2003;301:653–657. doi: 10.1126/science.1086391.
  3. Andreux PA, Williams EG, Koutnikova H, Houtkooper RH, Champy MF, Henry H, Schoonjans K, Williams RW, Auwerx J. Systems genetics of metabolism: the use of the BXD murine reference panel for multiscalar integration of traits. Cell. 2012;150:1287–1299. doi: 10.1016/j.cell.2012.08.012.
  4. Arru L, Francia E, Pecchioni N. Isolate-specific QTLs of resistance to leaf stripe (Pyrenophora graminea) in the ‘Steptoe’ x ‘Morex’ spring barley cross. Theor Appl Genet. 2003;106:668–675. doi: 10.1007/s00122-002-1115-x.
  5. Auwerx J, Avner P, Baldock R, Ballabio A, Balling R, Barbacid M, Berns A, Bradley A, Brown S, Carmeliet P, et al. The European dimension for the mouse genome mutagenesis program. Nature genetics. 2004;36:925–927. doi: 10.1038/ng0904-925.
  6. Ayyadevara S, Ayyadevara R, Hou S, Thaden JJ, Shmookler Reis RJ. Genetic mapping of quantitative trait loci governing longevity of Caenorhabditis elegans in recombinant-inbred progeny of a Bergerac-BO x RC301 interstrain cross. Genetics. 2001;157:655–666. doi: 10.1093/genetics/157.2.655.
  7. Bailey DW. Recombinant inbred strains and bilineal congenic strains. New York: Academic Press; 1981.
  8. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehar J, Kryukov GV, Sonkin D, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.
  9. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, et al. NCBI GEO: archive for functional genomics data sets--update. Nucleic Acids Res. 2013;41:D991–995. doi: 10.1093/nar/gks1193.
  10. Bastian F, Parmentier G, Roux J, Moretti S, Laudet V, Robinson-Rechavi M. Bgee: Integrating and comparing heterogeneous transcriptome data among species. Lect N Bioinformat. 2008;5109:124–131.
  11. Bennett BJ, Farber CR, Orozco L, Kang HM, Ghazalpour A, Siemers N, Neubauer M, Neuhaus I, Yordanova R, Guan B, et al. A high-resolution association mapping panel for the dissection of complex traits in mice. Genome research. 2010;20:281–290. doi: 10.1101/gr.099234.109.
  12. Bogardus C, Baier L, Permana P, Prochazka M, Wolford J, Hanson R. Identification of susceptibility genes for complex metabolic diseases. Annals of the New York Academy of Sciences. 2002;967:1–6. doi: 10.1111/j.1749-6632.2002.tb04257.x.
  13. Burr B, Burr FA, Thompson KH, Albertson MC, Stuber CW. Gene mapping with recombinant inbreds in maize. Genetics. 1988;118:519–526. doi: 10.1093/genetics/118.3.519.
  14. Cai L, Friedman N, Xie XS. Stochastic protein expression in individual cells at the single molecule level. Nature. 2006;440:358–362. doi: 10.1038/nature04599.
  15. Capecchi MR. Gene targeting in mice: functional analysis of the mammalian genome for the twenty-first century. Nature reviews Genetics. 2005;6:507–512. doi: 10.1038/nrg1619.
  16. Chakravarti A, Clark AG, Mootha VK. Distilling pathophysiology from complex disease genetics. Cell. 2013;155:21–26. doi: 10.1016/j.cell.2013.09.001.
  17. Chen S, Sanjana NE, Zheng K, Shalem O, Lee K, Shi X, Scott DA, Song J, Pan JQ, Weissleder R, et al. Genome-wide CRISPR Screen in a Mouse Model of Tumor Growth and Metastasis. Cell. 2015;160:1246–1260. doi: 10.1016/j.cell.2015.02.038.
  18. Chesler EJ, Lu L, Wang J, Williams RW, Manly KF. WebQTL: rapid exploratory analysis of gene expression and genetic networks for brain and behavior. Nat Neurosci. 2004;7:485–486. doi: 10.1038/nn0504-485.
  19. Churchill GA, Airey DC, Allayee H, Angel JM, Attie AD, Beatty J, Beavis WD, Belknap JK, Bennett B, Berrettini W, et al. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nature genetics. 2004;36:1133–1137. doi: 10.1038/ng1104-1133.
  20. Clark AG. Limits to prediction of phenotypes from knowledge of genotypes. Evol Biol. 2000;32:205–224.
  21. Cronin RM, Field JR, Bradford Y, Shaffer CM, Carroll RJ, Mosley JD, Bastarache L, Edwards TL, Hebbring SJ, Lin S, et al. Phenome-wide association studies demonstrating pleiotropy of genetic variants within FTO with and without adjustment for body mass index. Frontiers in genetics. 2014;5:250. doi: 10.3389/fgene.2014.00250.
  22. Cubillos FA, Billi E, Zorgo E, Parts L, Fargier P, Omholt S, Blomberg A, Warringer J, Louis EJ, Liti G. Assessing the complex architecture of polygenic traits in diverged yeast populations. Molecular ecology. 2011;20:1401–1413. doi: 10.1111/j.1365-294X.2011.05005.x.
  23. Danhauser K, Sauer SW, Haack TB, Wieland T, Staufner C, Graf E, Zschocke J, Strom TM, Traub T, Okun JG, et al. DHTKD1 mutations cause 2-aminoadipic and 2-oxoadipic aciduria. American journal of human genetics. 2012;91:1082–1087. doi: 10.1016/j.ajhg.2012.10.006.
  24. Darvasi A, Soller M. Advanced intercross lines, an experimental population for fine genetic mapping. Genetics. 1995;141:1199–1207. doi: 10.1093/genetics/141.3.1199.
  25. Datta K, Baisakh N, Ganguly M, Krishnan S, Yamaguchi Shinozaki K, Datta SK. Overexpression of Arabidopsis and rice stress genes’ inducible transcription factor confers drought and salinity tolerance to rice. Plant biotechnology journal. 2012;10:579–586. doi: 10.1111/j.1467-7652.2012.00688.x.
  26. De Haan G, Van Zant G. Genetic analysis of hematopoietic cell cycling in mice suggests its involvement in organismal life span. FASEB journal : official publication of the Federation of American Societies for Experimental Biology. 1999;13:707–713. doi: 10.1096/fasebj.13.6.707.
  27. Deeb SS, Fajas L, Nemoto M, Pihlajamaki J, Mykkanen L, Kuusisto J, Laakso M, Fujimoto W, Auwerx J. A Pro12Ala substitution in PPARgamma2 associated with decreased receptor activity, lower body mass index and improved insulin sensitivity. Nature genetics. 1998;20:284–287. doi: 10.1038/3099.
  28. Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, Field JR, Pulley JM, Ramirez AH, Bowton E, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nature biotechnology. 2013;31:1102–1110. doi: 10.1038/nbt.2749.
  29. Elmer GI, Pieper JO, Hamilton LR, Wise RA. Qualitative differences between C57BL/6J and DBA/2J mice in morphine potentiation of brain stimulation reward and intravenous self-administration. Psychopharmacology. 2010;208:309–321. doi: 10.1007/s00213-009-1732-z.
  30. Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM. Electrospray ionization for mass spectrometry of large biomolecules. Science. 1989;246:64–71. doi: 10.1126/science.2675315.
  31. Flint J, Mackay TF. Genetic architecture of quantitative traits in mice, flies, and humans. Genome research. 2009;19:723–733. doi: 10.1101/gr.086660.108.
  32. Flint J, Valdar W, Shifman S, Mott R. Strategies for mapping and cloning quantitative trait genes in rodents. Nature reviews Genetics. 2005;6:271–286. doi: 10.1038/nrg1576.
  33. Fuhrer T, Heer D, Begemann B, Zamboni N. High-throughput, accurate mass metabolome profiling of cellular extracts by flow injection-time-of-flight mass spectrometry. Analytical chemistry. 2011;83:7074–7080. doi: 10.1021/ac201267k.
  34. Gelman R, Watson A, Bronson R, Yunis E. Murine chromosomal regions correlated with longevity. Genetics. 1988;118:693–704. doi: 10.1093/genetics/118.4.693.
  35. Goldstein DB. Common genetic variation and human traits. N Engl J Med. 2009;360:1696–1698. doi: 10.1056/NEJMp0806284.
  36. Harris TW, Antoshechkin I, Bieri T, Blasiar D, Chan J, Chen WJ, De La Cruz N, Davis P, Duesbury M, Fang R, et al. WormBase: a comprehensive resource for nematode research. Nucleic Acids Res. 2010;38:D463–467. doi: 10.1093/nar/gkp952.
  37. Horvath S. DNA methylation age of human tissues and cell types. Genome Biol. 2013;14:R115. doi: 10.1186/gb-2013-14-10-r115.
  38. Houtkooper RH, Mouchiroud L, Ryu D, Moullan N, Katsyuba E, Knott G, Williams RW, Auwerx J. Mitonuclear protein imbalance as a conserved longevity mechanism. Nature. 2013;497:451–457. doi: 10.1038/nature12188.
  39. Jiang XC, Agellon LB, Walsh A, Breslow JL, Tall A. Dietary cholesterol increases transcription of the human cholesteryl ester transfer protein gene in transgenic mice. Dependence on natural flanking sequences. The Journal of clinical investigation. 1992;90:1290–1295. doi: 10.1172/JCI115993.
  40. Jovaisaite V, Mouchiroud L, Auwerx J. The mitochondrial unfolded protein response, a conserved stress response pathway with implications in health and disease. The Journal of experimental biology. 2014;217:137–143. doi: 10.1242/jeb.090738.
  41. Kamath R, Fraser A, Dong Y, Poulin G, Durbin R, Gotta M, Kanapin A, Le Bot N, Moreno S, Sohrmann M, et al. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature. 2003;421:231–237. doi: 10.1038/nature01278.
  42. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research. 2014;42:D199–D205. doi: 10.1093/nar/gkt1076.
  43. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome research. 2002;12:996–1006. doi: 10.1101/gr.229102.
  44. Khan D, Chan A, Millar JL, Girard IJ, Belmonte MF. Predicting transcriptional circuitry underlying seed coat development. Plant science : an international journal of experimental plant biology. 2014;223:146–152. doi: 10.1016/j.plantsci.2014.03.016.
  45. King EG, Macdonald SJ, Long AD. Properties and power of the Drosophila Synthetic Population Resource for the routine dissection of complex traits. Genetics. 2012;191:935–949. doi: 10.1534/genetics.112.138537.
  46. Kingsmore SF, Lindquist IE, Mudge J, Beavis WD. Genome-wide association studies: progress in identifying genetic biomarkers in common, complex diseases. Biomarker insights. 2007;2:283–292.
  47. Koutnikova H, Laakso M, Lu L, Combe R, Paananen J, Kuulasmaa T, Kuusisto J, Haring HU, Hansen T, Pedersen O, et al. Identification of the UBP1 locus as a critical blood pressure determinant using a combination of mouse and human genetics. PLoS Genet. 2009;5:e1000591. doi: 10.1371/journal.pgen.1000591.
  48. Kover PX, Valdar W, Trakalo J, Scarcelli N, Ehrenreich IM, Purugganan MD, Durrant C, Mott R. A Multiparent Advanced Generation Inter-Cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genet. 2009;5:e1000551. doi: 10.1371/journal.pgen.1000551.
  49. Lander ES. Initial impact of the sequencing of the human genome. Nature. 2011;470:187–197. doi: 10.1038/nature09792.
  50. Ledford H. Drug bests cystic-fibrosis mutation. Nature. 2012;482:145. doi: 10.1038/482145a.
  51. Lee PN, Forey BA, Coombs KJ. Systematic review with meta-analysis of the epidemiological evidence in the 1900s relating smoking to lung cancer. BMC cancer. 2012;12:385. doi: 10.1186/1471-2407-12-385.
  52. Li TK, Lumeng L. Alcohol preference and voluntary alcohol intakes of inbred rat strains and the National Institutes of Health heterogeneous stock of rats. Alcoholism, clinical and experimental research. 1984;8:485–486. doi: 10.1111/j.1530-0277.1984.tb05708.x.
  53. Liu SC, Kowalski SP, Lan TH, Feldmann KA, Paterson AH. Genome-wide high-resolution mapping by recurrent intermating using Arabidopsis thaliana as a model. Genetics. 1996;142:247–258. doi: 10.1093/genetics/142.1.247.
  54. MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science. 2012;335:823–828. doi: 10.1126/science.1215040.
  55. Mackay TF, Richards S, Stone EA, Barbadilla A, Ayroles JF, Zhu D, Casillas S, Han Y, Magwire MM, Cridland JM, et al. The Drosophila melanogaster Genetic Reference Panel. Nature. 2012;482:173–178. doi: 10.1038/nature10811.
  56. Marx V. Biology: The big challenges of big data. Nature. 2013;498:255–260. doi: 10.1038/498255a.
  57. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, Hirschhorn JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature reviews Genetics. 2008;9:356–369. doi: 10.1038/nrg2344.
  58. Nadeau JH, Singer JB, Matin A, Lander ES. Analysing complex genetic traits with chromosome substitution strains. Nature genetics. 2000;24:221–225. doi: 10.1038/73427.
  59. Nuzhdin SV, Pasyukova EG, Dilda CL, Zeng ZB, Mackay TF. Sex-specific quantitative trait loci affecting longevity in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America. 1997;94:9734–9739. doi: 10.1073/pnas.94.18.9734.
  60. Ossowski S, Schneeberger K, Clark RM, Lanz C, Warthmann N, Weigel D. Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome research. 2008;18:2024–2033. doi: 10.1101/gr.080200.108.
  61. Park SJ, Jiang K, Tal L, Yichie Y, Gar O, Zamir D, Eshed Y, Lippman ZB. Optimization of crop productivity in tomato using induced mutations in the florigen pathway. Nature genetics. 2014;46:1337–1342. doi: 10.1038/ng.3131.
  62. Peirce JL, Lu L, Gu J, Silver LM, Williams RW. A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC Genet. 2004;5:7. doi: 10.1186/1471-2156-5-7.
  63. Pospisilik JA, Schramek D, Schnidar H, Cronin SJ, Nehme NT, Zhang X, Knauf C, Cani PD, Aumayr K, Todoric J, et al. Drosophila genome-wide obesity screen reveals hedgehog as a determinant of brown versus white adipose cell fate. Cell. 2010;140:148–160. doi: 10.1016/j.cell.2009.12.027.
  64. Pravenec M, Klir P, Kren V, Zicha J, Kunes J. An analysis of spontaneous hypertension in spontaneously hypertensive rats by means of new recombinant inbred strains. Journal of hypertension. 1989;7:217–221.
  65. Prayle AP, Hurley MN, Smyth AR. Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: cross sectional study. BMJ. 2012;344:d7373. doi: 10.1136/bmj.d7373.
  66. Rockman MV, Kruglyak L. Recombinational landscape and population genomics of Caenorhabditis elegans. PLoS Genet. 2009;5:e1000419. doi: 10.1371/journal.pgen.1000419.
  67. Rost HL, Rosenberger G, Navarro P, Gillet L, Miladinovic SM, Schubert OT, Wolski W, Collins BC, Malmstrom J, Malmstrom L, et al. OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nature biotechnology. 2014;32:219–223. doi: 10.1038/nbt.2841.
  68. Rual JF, Ceron J, Koreth J, Hao T, Nicot AS, Hirozane-Kishikawa T, Vandenhaute J, Orkin SH, Hill DE, van den Heuvel S, et al. Toward improving Caenorhabditis elegans phenome mapping with an ORFeome-based RNAi library. Genome research. 2004;14:2162–2168. doi: 10.1101/gr.2505604.
  69. Ryder E, Ashburner M, Bautista-Llacer R, Drummond J, Webster J, Johnson G, Morley T, Chan YS, Blows F, Coulson D, et al. The DrosDel deletion collection: a Drosophila genomewide chromosomal deficiency resource. Genetics. 2007;177:615–629. doi: 10.1534/genetics.107.076216.
  70. Sander JD, Joung JK. CRISPR-Cas systems for editing, regulating and targeting genomes. Nature biotechnology. 2014;32:347–355. doi: 10.1038/nbt.2842.
  71. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467–470. doi: 10.1126/science.270.5235.467.
  72. Schilling RK, Marschner P, Shavrukov Y, Berger B, Tester M, Roy SJ, Plett DC. Expression of the Arabidopsis vacuolar H(+)-pyrophosphatase gene (AVP1) improves the shoot biomass of transgenic barley and increases grain yield in a saline field. Plant biotechnology journal. 2014;12:378–386. doi: 10.1111/pbi.12145.
  73. Shin SY, Fauman EB, Petersen AK, Krumsiek J, Santos R, Huang J, Arnold M, Erte I, Forgetta V, Yang TP, et al. An atlas of genetic influences on human blood metabolites. Nature genetics. 2014;46:543–550. doi: 10.1038/ng.2982.
  74. Sivasundar A, Hey J. Sampling from natural populations with RNAi reveals high outcrossing and population structure in Caenorhabditis elegans. Current biology : CB. 2005;15:1598–1602. doi: 10.1016/j.cub.2005.08.034.
  75. Skarnes WC, Rosen B, West AP, Koutsourakis M, Bushell W, Iyer V, Mujica AO, Thomas M, Harrow J, Cox T, et al. A conditional knockout resource for the genome-wide study of mouse gene function. Nature. 2011;474:337–342. doi: 10.1038/nature10163.
  76. Smith L, Hood L. Mapping and Sequencing the Human Genome - How to Proceed. Bio-Technol. 1987;5:933–939.
  77. Soussi T, Legros Y, Lubin R, Ory K, Schlichtholz B. Multifactorial analysis of p53 alteration in human cancer: a review. International journal of cancer Journal international du cancer. 1994;57:1–9. doi: 10.1002/ijc.2910570102.
  78. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
  79. Taylor BA, Heiniger HJ, Meier H. Genetic analysis of resistance to cadmium-induced testicular damage in mice. Proceedings of the Society for Experimental Biology and Medicine Society for Experimental Biology and Medicine. 1973;143:629–633. doi: 10.3181/00379727-143-37380.
  80. Valdar W, Solberg LC, Gauguier D, Burnett S, Klenerman P, Cookson WO, Taylor MS, Rawlins JN, Mott R, Flint J. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nature genetics. 2006;38:879–887. doi: 10.1038/ng1840.
  81. Wang JT, Williams RW, Manly KF. WebQTL - Web-based complex trait analysis. Neuroinformatics. 2003;1:299–308. doi: 10.1385/NI:1:4:299.
  82. Wasinger VC, Cordwell SJ, Cerpa-Poljak A, Yan JX, Gooley AA, Wilkins MR, Duncan MW, Harris R, Williams KL, Humphery-Smith I. Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis. 1995;16:1090–1094. doi: 10.1002/elps.11501601185.
  83. Williams RW, Bennett B, Lu L, Gu J, DeFries JC, Carosone-Link PJ, Rikke BA, Belknap JK, Johnson TE. Genetic structure of the LXS panel of recombinant inbred mouse strains: a powerful resource for complex trait analysis. Mamm Genome. 2004;15:637–647. doi: 10.1007/s00335-004-2380-6.
  84. Williams RW, Gu J, Qi S, Lu L. The genetic structure of recombinant inbred mice: high-resolution consensus maps for complex trait analysis. Genome Biol. 2001;2:RESEARCH0046. doi: 10.1186/gb-2001-2-11-research0046.
  85. Winzeler EA, Shoemaker DD, Astromoff A, Liang H, Anderson K, Andre B, Bangham R, Benito R, Boeke JD, Bussey H, et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science. 1999;285:901–906. doi: 10.1126/science.285.5429.901.
  86. Wu Y, Williams EG, Dubuis S, Mottis A, Jovaisaite V, Houten SM, Argmann CA, Faridi P, Wolski W, Kutalik Z, et al. Multilayered genetics and omics dissection of mitochondrial activity in a mouse reference population. Cell. 2014;158 doi: 10.1016/j.cell.2014.07.039.
  87. Xu JJ, Zhao QA, Du PN, Xu CW, Wang BH, Feng Q, Liu QQ, Tang SZ, Gu MH, Han B, et al. Developing high throughput genotyped chromosome segment substitution lines based on population whole-genome re-sequencing in rice (Oryza sativa L.). BMC genomics. 2010;11 doi: 10.1186/1471-2164-11-656.
  88. Xu XM, Møller SG. The value of Arabidopsis research in understanding human disease states. Current opinion in biotechnology. 2011;22:300–307. doi: 10.1016/j.copbio.2010.11.007.
  89. Yamamoto S, Jaiswal M, Charng WL, Gambin T, Karaca E, Mirzaa G, Wiszniewski W, Sandoval H, Haelterman NA, Xiong B, et al. A Drosophila Genetic Resource of Mutants to Study Mechanisms Underlying Human Genetic Diseases. Cell. 2014;159:200–214. doi: 10.1016/j.cell.2014.09.002.
  90. Yeo GS, Farooqi IS, Aminian S, Halsall DJ, Stanhope RG, O’Rahilly S. A frameshift mutation in MC4R associated with dominantly inherited human obesity. Nature genetics. 1998;20:111–112. doi: 10.1038/2404.
  91. Yuan J, Njiti VN, Meksem K, Iqbal MJ, Triwitayakorn K, Kassem MA, Davis GT, Schmidt ME, Lightfoot DA. Quantitative trait loci in Two Soybean Recombinant Inbred Line Populations Segregating for Yield and Disease Resistance. Crop science. 2002;42:271–277. doi: 10.2135/cropsci2002.2710.
  92. Zamir D. Botany. A wake-up call with coffee. Science. 2014;345:1124. doi: 10.1126/science.1258941.
  93. Zhao L, Liu F, Xu W, Di C, Zhou S, Xue Y, Yu J, Su Z. Increased expression of OsSPX1 enhances cold/subfreezing tolerance in tobacco and Arabidopsis thaliana. Plant biotechnology journal. 2009;7:550–561. doi: 10.1111/j.1467-7652.2009.00423.x.
