Author manuscript; available in PMC: 2022 Mar 1.
Published in final edited form as: Trends Genet. 2020 Sep 30;37(3):251–265. doi: 10.1016/j.tig.2020.09.007

Mouse genetic reference populations: cellular platforms for integrative systems genetics

Emily Swanzey 1, Callan O’Connor 1,2, Laura Reinholdt 1,*
PMCID: PMC7889615  NIHMSID: NIHMS1627364  PMID: 33010949

Abstract

Interrogation of disease-relevant cellular and molecular traits exhibited by genetically diverse cell populations enables in vitro systems genetics approaches for uncovering the basic properties of cellular function and identity. Primary cells, stem cells, and organoids derived from genetically diverse mouse strains, such as Collaborative Cross and Diversity Outbred populations, offer the opportunity for parallel in vitro / in vivo screening. These panels provide genetic resolution for variant discovery and functional characterization, as well as disease modeling and in vivo validation capabilities. Here we review mouse cellular systems genetics approaches for characterizing the influence of genetic variation on signaling networks and phenotypic diversity, and we discuss approaches for data integration and cross-species validation.

Keywords: Systems genetics, integrative genomics, quantitative trait locus, genetic reference population, collaborative cross, diversity outbred

Connecting Complex Traits with Causal Genetic Variants

Variation in physiological and molecular phenotypes across individuals and organisms is a central evolutionary property that is, in part, determined by genetic variation. These traits can range from cellular morphology to behavior and disease susceptibility and are typically influenced by multiple genes. In fact, the ability of genetic background to modulate expression of genes implicated in purported Mendelian traits (see Glossary) suggests that few truly monogenic traits exist and that most are genetically complex, with multiple variants of varying effect sizes contributing to a phenotype [1-3]. However, determining the genes involved in these complex traits still poses a challenge to research geneticists due to a lack of genetic resolution, epistasis, and complex gene-environment (GxE) interactions. A further dilemma is that results from model organism studies involving limited genetic variation often do not generalize to genetically diverse human populations [4].

The expanding complexity of genotype-phenotype relationships revealed by high throughput genomic technologies has prompted a surge in population genetics research in recent years. Genome-wide association studies (GWASs), for example, aim to correlate the frequency of single nucleotide polymorphisms (SNPs) with diseases or traits. While GWASs have been successful in identifying many loci associated with complex traits [i], the results from these studies often do not provide evidence for the specific underlying biological mechanisms. This is, in part, because most associations are found in regions containing many mutually correlated SNPs (i.e., linkage disequilibrium [5]), making it challenging to pinpoint causal variants. In addition, many associated SNPs have been found in intergenic regions, suggesting the existence of underlying regulatory mechanisms that cannot be dissected with correlative studies alone. Finally, many common variants in the human population only subtly influence the trait of interest, and therefore large sample sizes are required to detect these small effects. Altogether, sample size requirements and the inability to functionally interrogate pathways in human cellular networks have often limited our understanding of GWAS loci.

To bridge the gap between genotype and phenotype, approaches were needed to integrate molecular data with genotype and phenotype data. Systems genetics provided such a platform where multiple phenotypes, including cellular and molecular traits, can be collected from a genetic reference population (GRP) and associated with genetic variation using statistical genetics. With this approach, phenotype-regulating genomic regions, known as quantitative trait loci (QTL), can be identified in a genetically diverse population with fully sequenced genomes and catalogs of phenotype data. Recent advances in the scalability of high-throughput sequencing technologies have enabled large-scale acquisition of transcriptomic, epigenetic, proteomic and metabolomic data across different cell types, treatment groups and phenotypes. With these molecular data, GRPs can be associated with specific omics signatures, providing a robust means of connecting traits with causal genetic variants and cellular network perturbations, and revealing a deeper understanding of the molecular cause of interindividual variation.

Cellular Models from Genetic Reference Populations

The power of systems genetics comes from the ability to integrate a multitude of cellular and molecular datasets for genetically diverse GRPs, enabling multidimensional analyses. In this framework, in vitro GRP models with renewable cell populations offer attractive and scalable platforms for systems genetics. A GRP with sufficient genetic diversity can be simulated in cell lines through mutagenic approaches such as chemical mutagenesis or CRISPR/Cas9-mediated perturbations, which have been used for genome-wide screening of essential genes [6], enhancer elements [7, 8], and for saturation mutagenesis of contiguous chromosomal regions for cis-regulatory element discovery [9]. Alternatively, cellular GRPs can be created by deriving cells directly from a genetically diverse population. While some advantages exist in the former approach, natural diversity encompasses the full spectrum of variant types found in populations (e.g. copy number, structural and coevolved variants) and will be the focus of this review.

A variety of model organism GRPs are available. Ultimately, the choice of GRP depends not only on the ability of that organism to appropriately and economically model a particular disease, but also on the diversity, strength, and haplotype structure of the alleles segregating in the population. Cellular GRPs can be created by deriving cell lines (“in vitro”) or culturing primary cells (“ex vivo”) directly from a genetically diverse population, thereby capturing the resident natural variation in those individuals [10-13]. The development and application of these cellular panels are active areas of genetic resource and technology development. With these GRPs, systems genetics and integrative genomics can be used to contextualize the cellular mechanisms of complex trait phenotypes. In this review we discuss how these tools can be used to approximate human genetic variation in cellular systems to dissect network regulation and advance the field of systems genetics.

Genetically diverse human cell panels

Large-scale international human genome sequencing projects like the 1000 Genomes Project have produced collections of genetically diverse lymphoblastoid cell lines (Table 1) that have been used in a range of genetic studies, demonstrating the power of in vitro-generated molecular QTL data for fine mapping of cis-regulatory variation [14], improving GWAS resolution [15, 16], and discovering genetic variation underlying differential GxE responses [15, 17]. These panels of cells have extensive genomic and molecular data that are publicly available [18]. Such resources demonstrate the power of cellular platforms for systems genetics in the human population, but they are limited to essentially one cell type.

Table 1.

Examples of existing systems genetics reference populations and cell lines (where applicable) for in vitro applications.

| Organism | Genetics | Panel | Derivation | System | Sources |
|---|---|---|---|---|---|
| Human | Heterogeneous | HipSci | iPSC lines derived from ~150 rare disease and ~500 healthy donors | hiPSCs | [19-21] |
| Human | Heterogeneous | iPSCORE | hiPSC lines derived from ~220 donors including matched familial and twin lines | hiPSCs | [10] |
| Human | Heterogeneous | Lymphoblastoid cell lines | Derived from peripheral blood of donor populations [e.g. 1,000 Genomes, HapMap Projects, NHGRI Human Genetics Cell Repository] | LCLs | [103], Coriell Cell Repository |
| Mouse | Inbred strains | Hybrid mouse diversity panel (HMDP), Mouse Phenome Panel | Collections of diverse and well characterized inbred strains | MEFs, limited iPSCs, limited mESCs | [12, 58, 104, 105], The Jackson Laboratory |
| Mouse | RI & RI advanced intercross | BXD | Strains developed by crossing C57BL/6J and DBA/2J mice | Ex vivo only to date | [106, 107] |
| Mouse | RI | Collaborative Cross (CC) | Highly diverse strains derived from an eight-way cross of A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/HlLtJ, CAST/EiJ, PWK/PhJ, and WSB/EiJ mice | Ex vivo, limited mESCs, MEFs | [13, 108, 109] |
| Mouse | Outbred | Diversity Outbred (DO) | Panel developed by random outcross matings of 160 CC mice and maintained by random matings that avoid sibling crosses | Ex vivo, mESCs | [110], Predictive Biology |
| Saccharomyces cerevisiae | RI | SGRP-4X | Panel developed from a four-way cross and then intercrossing for 12 generations | Whole organism | [34] |
| Saccharomyces cerevisiae | Outbred | 18F12 | Two populations developed from 18 genetically diverse founder strains | Whole organism | [33] |

Pluripotent cell lines offer distinct advantages over primary cells because they can be differentiated into a potentially unlimited variety of cell types (Table 1) [19-21]. Both the NextGen Consortium and the Human Induced Pluripotent Stem Cell Initiative (HipSci) have been generating reference sets of hiPSC lines from healthy donors and patients with rare genetic diseases for a host of research purposes [10, 11]. Recently, genetic mapping studies of molecular traits in these cellular GRPs have revealed the role of genetic variation in the regulation of intrachromosomal chromatin accessibility and gene expression in iPSCs [22]. Moreover, elegant “village in a dish” approaches have been used to develop Census-seq, which is a genetic mapping approach that associates cellular phenotypes to individual genotypes cultured in pools [23]. However, hurdles associated with population structure, limited genetic diversity in some panels, and heterogeneity in human environmental exposure must be carefully considered.

Cell panels from model organism GRPs

Model organism GRPs have been precisely designed to offer increased statistical power (e.g. mouse: [24-26], Drosophila: [27], and yeast: below) and an opportunity for systems genetic discovery with in vivo correlation and validation capabilities. One of the pillars of model organism genetic mapping studies has been the analysis of pairwise crosses between two genetically divergent founder strains (Figure 1A). With the use of large populations with segregating alleles, phenotypes can be mapped to genotypes. This approach has been particularly rewarding in yeast, where QTL have been shown to explain over 70% of the narrow-sense heritability of traits [28-33]. However, these panels severely undersample the functional variation that could segregate in natural populations. For this reason, multi-parent populations (MPPs) have gained popularity for their ability to combine fine mapping resolution with increased genetic variation from the founders. MPP generation is accomplished by crossing several inbred founder strains to one another, thereby increasing genetic diversity, and then by further introgression to capture the products of accumulating recombination events. These individuals can also be inbred to create a panel of recombinant inbred (RI) lines that are homozygous, reproducible genetic mosaics of the founder strains. Such MPP panels have now been generated for in vitro studies in Saccharomyces cerevisiae using 4 and 18 fully sequenced founder strains [33, 34] (Table 1).

Figure 1. Generation of GRPs from Inbred Laboratory Strains.

Breeding schemes for the generation of GRP populations. Different colors represent the genotypes of the respective inbred founder strain chromosomes. (A) For a standard RI GRP, two founder strains are crossed to generate an F1 population. F1 progeny are then crossed to generate F2 mice, followed by inbreeding to establish a panel of RI strains. (B) High diversity mouse GRPs were developed by crossing eight inbred founder strains (all possible founder cross combinations were generated). The progeny were crossed to develop the CC panel of RI strains, which combine high diversity with reproducibility. CC mice were then outbred to generate the DO panel, which lends itself to greater phenotypic variation, and higher resolution mapping given accumulated recombination events.

Mouse GRPs

Given the difficulties in conducting research with human samples and the phenotypic limitations of yeast GRPs, mammalian model systems, such as mice, have been invaluable tools for cross-species comparison (Table 1 and Figure 1B). Studies have repeatedly shown considerable overlap in the gene orthologs and pathways that contribute to complex traits between rodents and humans, highlighting the translatability of these systems [35-38]. In particular, RI (Figure 1A) and outbred mouse populations provide a powerful genetic resource to study cellular systems with in vivo and cross-species validation capabilities (Table 1 and Figure 1B). The ability to continually regenerate these mice or comparable genotypes has enabled researchers to compile data across cell types and environmental conditions (reviewed by [39]). The mouse systems genetics field has been generating such data for decades, primarily with genetically diverse inbred, consomic, and RI strains like the BXD mouse panel (Table 1), among other panels (reviewed by [40]). These data are housed in databases such as GeneNetwork [ii] and the Mouse Phenome Database [iii], where they can be used for further trait correlation and post-hoc analyses. The BXD panel is an invaluable tool with moderate genetic variation that allows for more facile discovery of causative variants with a range of effect sizes.

While the BXD captures 13% of known genetic variation in mice, the more recently developed Collaborative Cross (CC) and Diversity Outbred (DO) MPP panels capture over 90% of the genetic variation in laboratory mice and, with that, greater phenotypic diversity [41]. CC and DO panels were derived from eight inbred parental strains and capture approximately 45 million segregating polymorphisms [42], representing 94 million years of evolutionary divergence. This can be compared to ~30 years of divergence captured in a substrain reduced complexity cross (reviewed by [43]), ~100 years captured in a standard two-way RI cross involving common laboratory inbred strains of the same subspecies, and ~40,000 years of divergence captured in much of the human population after the great expansion [44, 45]. The CC panel is composed of RI strains designed to have controlled genomic randomization that is normally distributed across the genome, enabling high reproducibility for systems genetics studies (Table 1, Figure 1B). The DO population was generated by random outcrossings of CC strains to create a population of mice with abundant and broadly distributed genetic diversity, high allelic heterozygosity, balanced founder allele frequencies, and accumulated recombination events that provide high resolution for genetic mapping (Table 1, Figure 1B). The mating scheme used to generate and maintain the DO doubles the effective population size and minimizes the impact of selection and genetic drift on allele frequencies, increasing the statistical power of the DO panel and reducing the burden of sample size [46, 47]. CC strains offer the advantage of replicable genomes, while every DO mouse is genetically unique and cannot be replicated at the genome level. However, selection of CC inbred founder strains based on genetic variation discovered in the derivative DO population allows for identification of replicable genotypes in specific regions of interest.

While there is clear utility of these panels, in vitro application of genetically diverse cell systems is still a largely untapped resource due to lack of available GRP cell panels and, until recently, lack of scalable genomic technologies. Increased availability of these resources will provide new opportunities to scale and integrate systems genetic discovery in organisms that are amenable to in vivo validation and modeling human disease (e.g. [13]). This approach lends itself to large-scale studies that are economically feasible, while providing a wealth of information in an ethical way (e.g. limiting animal use).

In vitro systems genetics applications

Molecular and cellular traits for systems genetic analysis

Genetic variation determines phenotypic variation through a hierarchy of molecular and cellular phenotypes. In addition to coding variants that directly affect protein stability and/or function, these can include non-coding regulatory variants that dictate proximal molecular configurations like chromatin state that are then reflected in transcript abundance, translational control mechanisms, post translational modifications and protein interactions. These molecular phenotypes can then influence cell function, tissue composition, and ultimately physiological states (Figure 2A). Statistical associations between genetic variation and quantitative measurements along any level of this hierarchy of traits identify the relative contributions of loci harboring functional allelic variants.

Figure 2 (Key Figure). Data Generation Workflow for the Study of In Vitro Systems Genetics.

(A) Potential cellular applications of DO and CC mouse panels: studying the influence of genetic diversity on cell state transitions, cellular function and response to perturbations in screening applications. (B) Workflow for the generation of systems genetics datasets from a high diversity mouse GRP (CC or DO). Cell lines are derived from GRP mice and phenotype data is acquired for each line: cellular traits such as cell size and viability, in addition to molecular traits from omics datasets. By integrating SNP genotype data from each cell line, the genomic region containing variant(s) implicated in phenotypic variation can then be identified with QTL mapping of each trait. Regions with statistically significant logarithm of the odds (LOD) peaks can be dissected for SNPs associated with each of the inbred founder strains to determine the strain(s) containing the causal variant (represented in the EFFECT plot).
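The LOD score mentioned in the caption can be illustrated with a minimal single-marker scan. This is a sketch under simplifying assumptions rather than the workflow used for CC or DO data: each marker is treated as biallelic with an additive dosage coding (0, 1, 2), the trait is fit by ordinary least squares, and relatedness among lines is ignored, whereas mapping in multi-parent populations typically models founder haplotype probabilities and kinship. The marker names and trait values are hypothetical toy data.

```python
import math

def lod_score(genotypes, phenotypes):
    """LOD for one marker: log10 likelihood ratio of an additive
    linear model versus a mean-only (no QTL) model."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    rss_null = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or rss_null == 0:
        return 0.0  # monomorphic marker or constant trait: no evidence
    # Residual sum of squares after regressing the trait on dosage.
    rss_alt = max(rss_null - sxy ** 2 / sxx, 1e-12)
    return (n / 2) * math.log10(rss_null / rss_alt)

def genome_scan(markers, phenotypes):
    """Return {marker name: LOD} for every marker."""
    return {name: lod_score(g, phenotypes) for name, g in markers.items()}

# Hypothetical toy data: eight cell lines, two markers, one cellular trait.
markers = {
    "marker_1": [0, 0, 1, 1, 2, 2, 0, 2],  # tracks the trait
    "marker_2": [0, 1, 0, 1, 0, 1, 2, 2],  # unrelated to the trait
}
trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2, 0.9, 2.9]

lods = genome_scan(markers, trait)
peak = max(lods, key=lods.get)
print(peak, round(lods[peak], 2))
```

In a real scan, the significance threshold for a LOD peak would be set empirically (for example by permutation), which this sketch does not attempt.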

For molecular traits, next-generation sequencing (NGS) platforms have made obtaining sequencing data cheaper and more robust than ever (Figure 2B). For example, expression QTL (eQTL) analysis correlates mRNA transcript abundance with genetic variation and can be used to generate co-expression networks and develop correlations between gene expression levels and the functional phenotype of interest or with existing GWAS data [48]. In combination with other types of molecular QTL analysis, this approach not only enables precise mapping of causal genes but also facilitates an investigation across biological scales and across species (e.g. [49] and reviewed in [50, 51]). With this, the high reproducibility of CC mice and the high genetic diversity of DO mice (Figure 1B) enable fine variant effect resolution and a greater understanding of network regulation. Molecular QTLs are now used extensively in systems genetic research; however, approaches for QTL mapping and integration of heritable in vitro cellular traits are just emerging. Heritability estimates of in vitro traits can be quickly and easily accomplished using cell lines derived from GRPs. Here, replicable individuals with identical genotypes allow for quantitative estimates of interindividual variability due to intrinsic (genetic) and a variety of extrinsic (e.g. culture conditions) variables.
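One common way to turn replicate measurements into a heritability estimate is a one-way analysis of variance across lines. The sketch below assumes a balanced design (the same number of replicate cultures per line) and estimates broad-sense heritability as the between-line variance component divided by the total variance; it is an illustration of the idea, not a method taken from the studies cited here, and the line names and values are hypothetical.

```python
def broad_sense_h2(replicates_by_line):
    """Broad-sense heritability from replicated lines (balanced design).

    replicates_by_line: {line name: [trait value per replicate culture]}
    """
    lines = list(replicates_by_line.values())
    k = len(lines)     # number of genetically distinct lines
    n = len(lines[0])  # replicates per line
    if k < 2 or n < 2 or any(len(v) != n for v in lines):
        raise ValueError("need >= 2 lines, each with the same number (>= 2) of replicates")
    grand_mean = sum(sum(v) for v in lines) / (k * n)
    line_means = [sum(v) / n for v in lines]
    # Mean squares between lines and within lines.
    ms_between = n * sum((m - grand_mean) ** 2 for m in line_means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for v, m in zip(lines, line_means) for x in v) / (k * (n - 1))
    # Between-line (genetic) variance component, floored at zero.
    var_genetic = max((ms_between - ms_within) / n, 0.0)
    total = var_genetic + ms_within
    return var_genetic / total if total > 0 else 0.0

# Hypothetical toy data: three inbred-derived lines, three cultures each.
cell_size = {
    "line_A": [10.1, 9.9, 10.0],
    "line_B": [12.0, 12.2, 11.8],
    "line_C": [8.0, 8.1, 7.9],
}
h2 = broad_sense_h2(cell_size)
print(round(h2, 3))
```

Because the within-line term absorbs every non-genetic source of variation, the estimate is only as good as the replication scheme: replicates cultured on the same plate and day will understate extrinsic variance.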

Simulating developmental processes in vitro

Recent in vitro studies in mouse iPSCs and embryonic stem cells have linked genetic variation of inbred strains to instability of genomic imprinting [52], maintenance of pluripotency [12, 13], and early lineage commitment [53], all of which are important when considering quality and developmental potential of cell lines. With advancements in cell culture and differentiation technologies, CC and DO pluripotent cell lines provide an opportunity to study the influence of genetic variation on developmental processes in vitro. This includes using the aforementioned systems genetics tools to interrogate the ability of genetically diverse lines to undergo cell state transitions (e.g. stem cell to differentiated cell type) or to properly function in their undifferentiated and terminally differentiated states (Figure 2A). More sophisticated differentiation systems, such as organoids and other specialized differentiation protocols can also be implemented to simulate in vivo cell niche conditions. Finally, an additional advantage of using genetically diverse mouse pluripotent cell panels is that in vitro findings can be further validated in vivo with manipulation of the early embryo (e.g. cloning DO individuals by tetraploid blastocyst injection of DO mESCs or by selecting CC pluripotent cells carrying specific combinations of alleles) or by following genetically matched mice or comparable genotypes during development.

Precision medicine

GxE studies employ genetic approaches (QTL, GWAS) to identify interactions between genotype and variation in response to environmental stimuli. Using these approaches, interindividual variation in responses to pharmacological treatments and environmental exposures has been linked to differences in genetic background (reviewed by [54, 55]). For example, genetically diverse lymphoblastoid cell lines and primary cells from human populations have been used to map differential / response eQTL and chromatin accessibility QTL (caQTL) [15, 17, 56], providing a framework for systems genetics interrogation of such GxE interactions in panels of cells from GRPs. While in vivo drug screening in mice has been traditionally conducted on a single, F1 genetic background (essentially the equivalent of one human), there is now a multi-disciplinary shift towards more inclusionary screening (i.e., genetic diversity) [57, 58]. While this increased dimensionality lends itself to more accurate drug discovery, the increase in scale is more feasible in vitro. With this, even large numbers of genetically diverse cell lines can be screened across panels of drugs (Figure 2A). Pharmacogenetic screening has been implemented ex vivo using primary cells from genetically diverse mouse strains [59], tissue extracts [60], and fibroblasts [58]. Here, traditional drug screening phenotype parameters were automated to record cellular traits (proliferation, viability, metabolism), in addition to gene expression analyses and haplotype mapping to capture molecular traits and dissect response phenotypes.

High throughput cellular screening

Technological advancements and innovative applications of classic cellular assays have enabled high throughput cellular screening in increasingly complex in vitro systems. Intracellular and extracellular high-throughput cellular phenotyping methods can be loosely split into two categories: 1) clinical-based assays and 2) fluorescence-based assays. Clinical-based assays consist of liquid chromatography, mass spectrometry, Seahorse assays, and other kit-based assays, like enzyme-linked immunosorbent assays (ELISAs), that often rely on cellular lysates for signal detection. Alternatively, high throughput fluorescence-based assays like flow cytometry (FC) and high content screening (HCS) platforms offer the means to detect a wide variety of cellular and molecular traits at single cell resolution. Each of these assays allows for the rapid detection and quantification of molecular biomarkers of cellular states (e.g. oxidative stress, apoptosis) or cellular function / response (e.g. calcium signaling, electrophysiology). A review of these methodologies in cells from genetically diverse mouse strains is provided in [61].

The use of HCS to investigate cellular responses to xenobiotic exposure has been widely adopted within the pharmacology and toxicology communities [62-65]. HCS automates imaging of treatable microplates using fluorescent, brightfield, or digital phase contrast channels with the choices of resolution, optics, and z-stacking for 3D analysis. Thousands of molecular and functional endpoints are obtainable per cell during each HCS experiment, and image analysis software can detect cellular traits such as cytotoxicity, fluorescence detection and localization, and cellular morphology. Introducing genetic diversity into HCS adds an additional dimension to the data generated by these experiments. For the vast majority of cellular traits that can be measured using HCS, there is no a priori knowledge of interindividual variability, highlighting the importance of quality control measures to mitigate outliers, bias, and heteroscedasticity in heterogeneous cellular responses (for methodologies see [66]), as well as preliminary experiments to determine inter- and intraindividual variability prior to screening [56]; for mouse cellular GRPs, the latter is easily accomplished in cell lines derived from diverse inbred strains [13]. The sheer amount of data generated by HCS can lead to analysis bottlenecks; therefore, data reduction techniques like feature-feature correlation matrices [66] and the clustering of multiple endpoints are used to define distinct cellular profiles [67, 68]. Machine learning tool suites like CellProfiler [69] and PHENOLogic by PerkinElmer are also used to detect cellular traits in an unbiased fashion, including those indiscernible by eye, to define differential responses within and between treatment groups.

Data integration and validation in mouse GRPs

The goal of systems genetics is to identify genes within QTLs to reconstruct molecular pathways underlying complex phenotypes in an unbiased manner. Although outbred mapping populations like the DO panel have relatively small haplotype blocks compared to F2 intercrosses or RI strains, linkage disequilibrium is still a challenge, and QTL regions may still include many candidate genes and variants depending on the relatedness of the strains involved. Integration of heterogeneous molecular, cellular, and physiological data is a powerful approach to increasing resolution, as these data converge at the pathway level. This convergence means that molecular or cellular data, which offer economies of scale, can be used to approximate complex physiological or behavioral phenotypes that might be difficult to obtain, expensive, or subject to experimental variability.

Modeling networks

Based on the premise that a significant fraction of the genetic variation that underlies heritable trait variability influences gene expression (reviewed by [70]), RNAseq datasets are widely utilized for network modeling. Differential gene expression analysis (e.g. DESeq2 [71]), followed by gene set enrichment analysis [72, 73] and analysis of functional gene and pathway annotation (e.g. Gene Ontology [74], KEGG [75], and a growing number of disease-centric published gene sets), is used to identify gene expression signatures (Figure 3). These are correlated with quantitative or categorical phenotypes (e.g. susceptible vs. resilient) from genetically diverse samples, and across species. In these cases, pathways constructed from gene expression data can provide therapeutic leads even in the absence of genetic mapping [38, 76].
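The annotation step can be illustrated with a simple over-representation test, which asks whether an annotated gene set is enriched among differentially expressed genes using a one-sided hypergeometric test. Note that this is a simpler stand-in and not the rank-based gene set enrichment analysis cited above. The function names and gene identifiers below are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(universe_size, set_size, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random
    from a universe containing set_size annotated genes."""
    upper = min(set_size, n_selected)
    total = comb(universe_size, n_selected)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

def over_representation_p(de_genes, gene_set, background):
    """Enrichment p-value for one gene set, restricted to the tested background."""
    de = de_genes & background
    annotated = gene_set & background
    return hypergeom_upper_tail(
        len(background), len(annotated), len(de), len(de & annotated)
    )

# Hypothetical toy data: 20 measured genes, a 5-gene pathway, 5 DE genes.
background = {f"gene{i}" for i in range(20)}
pathway = {f"gene{i}" for i in range(5)}
de_genes = {"gene0", "gene1", "gene2", "gene3", "gene10"}

p_value = over_representation_p(de_genes, pathway, background)
print(f"{p_value:.4f}")
```

When many gene sets are tested, the resulting p-values would need a multiple-testing correction, which is omitted here.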

Figure 3. Data Integration for In Vivo Validation and Cross-species Predictive Modeling.

Overall workflow for systems genetics data analysis: in brief, molecular data can be used to model gene expression networks and identify relevant genes with principal component analysis (PCA) and gene set enrichment analysis (GSEA). Trait data can be combined to analyze module-trait correlations and trait-trait correlations. QTL mapping anchors these molecular and trait data to loci. This can be integrated with human GWAS or QTL data to deduce cross-species correlations and conserved synteny. These findings can then be validated in vivo in existing mouse models or engineered with gene editing. Finally, validated genes and/or cellular network pathways can be applied to human systems to inform risk for disease prevention, or as targets for pharmacological intervention or tailored gene therapies.

Gene network construction on the basis of correlated expression can be accomplished through weighted gene co-expression network analysis (WGCNA) [77]. This approach identifies groups, or “modules”, of genes with highly correlated expression patterns. Modules can be summarized using dimensional reduction techniques (e.g. principal component analysis), which can then be related to one another or assigned to traits (Figure 3). Bayesian network algorithms can then be applied to modules to reconstruct directional gene-gene relationships and to identify potential regulatory elements (reviewed by [78]). Correlation networks can also facilitate network-based genetic screening methods, which are used to identify candidate biomarkers or therapeutic targets (Figure 3). Keller et al. demonstrated that module “eigengenes” (i.e., the first principal component of scaled module expression profiles) can be correlated with specific molecular and cellular traits, and mapped QTLs responsible for interindividual variation in these module-trait networks [79]. These types of approaches essentially collapse multidimensional data (from multiple cell types, tissues and/or multiple data types) into summary phenotypes, which can then be used for genetic mapping [80].
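The eigengene definition above (the first principal component of scaled module expression profiles) can be sketched in a few lines. The example below is a dependency-free illustration that finds the leading component by power iteration; it is not the WGCNA implementation, and the sign convention (orienting the eigengene along the average scaled expression) is an assumption made here for interpretability. The expression values are hypothetical.

```python
import math

def standardize(profile):
    """Scale one gene's expression across samples to mean 0, SD 1."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / (n - 1))
    if sd == 0:
        raise ValueError("constant expression profile cannot be scaled")
    return [(x - mean) / sd for x in profile]

def module_eigengene(expr, n_iter=200):
    """First principal component (one value per sample) of a module.

    expr: list of gene profiles, each a list of expression values per sample.
    """
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    v = z[0][:]  # start from the first gene's scaled profile
    for _ in range(n_iter):
        # Power iteration on Z^T Z: project samples onto genes and back.
        gene_scores = [sum(a * b for a, b in zip(row, v)) for row in z]
        v = [sum(s * row[j] for s, row in zip(gene_scores, z))
             for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # Assumed convention: orient the eigengene along the average profile.
    avg = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(avg, v)) < 0:
        v = [-x for x in v]
    return v

# Hypothetical module: three co-expressed genes measured in five cell lines.
module_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [1.1, 2.0, 2.9, 4.2, 4.8],
]
eigengene = module_eigengene(module_expr)
print([round(x, 3) for x in eigengene])
```

The resulting vector has one entry per sample, so it can be treated like any other quantitative trait: correlated with cellular phenotypes or passed to a QTL scan.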

Integration of in vivo and in vitro networks

Taking advantage of tools for data integration and correlation, in vitro correlates of in vivo traits can be established through iterative testing and refinement of cell-based models and then implemented for in vitro systems genetics applications. However, in vitro platforms are reductive (often involving a single cell type and cell autonomous phenotypes) and should not be overinterpreted without sufficient knowledge of the underlying cell biology and its relevance. In this regard, integrative genetic studies demonstrating convergence of in vitro and in vivo-generated genetic maps can provide a data-driven rationale for supplanting in vivo screens with in vitro platforms.

Primary cells or cell lines from GRPs provide scalable platforms capturing molecular and cellular traits, which can be added to a compendium of phenotypes that can be evaluated on the same genotypes in a genetically diverse population. Keller et al. demonstrated this type of systems approach for mapping of Type 2 Diabetes (T2D) [81]. In this study, islets were harvested from over 350 genetically diverse (J:DO) mice and ex vivo insulin and glucagon secretion phenotypes were measured and integrated with extensive in vivo T2D phenotyping. Here, simple pairwise calculations of Pearson’s correlation coefficient between all traits allowed for correlation between in vivo and ex vivo measurements. The genetic architecture of T2D was revealed through genetic mapping of highly correlated ex vivo traits and downstream validation efforts focused on QTL hotspots to which multiple traits mapped [81]. These networks were then used to demonstrate that variation in any element, module, or principal component can predict phenotypic outcomes, making these likely candidate biomarkers of physiological states.
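The pairwise correlation step described above is straightforward to sketch. The example below computes Pearson's correlation coefficient for every pair of traits measured on the same individuals and ranks the pairs by absolute correlation. It assumes complete data (no missing values), and the trait names and values are hypothetical toy data, not measurements from the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pairwise_correlations(traits):
    """Return {(trait_a, trait_b): r} for every unordered pair of traits."""
    return {
        (a, b): pearson(traits[a], traits[b])
        for a, b in combinations(sorted(traits), 2)
    }

# Hypothetical toy data: one in vivo and two ex vivo traits for six mice.
traits = {
    "in_vivo_fasting_glucose": [90.0, 110.0, 130.0, 150.0, 170.0, 190.0],
    "ex_vivo_insulin_secretion": [5.0, 4.6, 4.1, 3.5, 3.1, 2.4],
    "ex_vivo_glucagon_secretion": [1.2, 0.9, 1.4, 1.0, 1.3, 1.1],
}
correlations = pairwise_correlations(traits)
ranked = sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)
for (trait_a, trait_b), r in ranked:
    print(f"{trait_a} vs {trait_b}: r = {r:.2f}")
```

Ex vivo traits that correlate strongly with an in vivo phenotype are the natural candidates to carry forward into genetic mapping.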

Cross-species data integration

As described above, when considering the consequences of genetic variation across species, similarities are more often observed at the level of pathways rather than individual variants. For this reason, methodologies for cross-species data integration focus on comparisons of orthologous genes, pathways, or networks [82, 83]. For example, a straightforward method of cross-species integration using gene expression data is to test for concordance between gene expression datasets derived from well-matched tissue collections. Core expression signatures or expression modules (e.g. WGCNA) from different species are tested for concordance within a study and/or compared to modules predicted from other publicly available eQTL or GWAS studies [76, 84-88]. Methods like these have been adapted for high dimensionality, where single cell expression data are transformed to biological process profiles or pathways that are then compared across species [89].

Gene-level integration of SNP-level genetic data (QTL and GWAS) takes advantage of known orthology and synteny relationships. This approach relies on conversion of SNP-level QTL data to nearest genes (QTL genes) [90] and uses existing mouse-human homology and synteny annotations (e.g. using Ensembl [91, 92]) for cross-species integration. These integrated genetic data can be used in risk assessment. For example, risk scores can be derived from model organism genetic data, which provide quantitative metrics of susceptibility [38]. These cross-species genomic relationships can be depicted in circular (circos) plots [93] (Figure 3). If GWAS data from related (same disease or related phenotypes) human population studies are available, further integration of such data can help to identify translationally relevant QTLs for validation. Human SNPs associated with a disease can be accessed through GWAS Central, and the number of associated SNPs within 1 Mb windows can be mapped to regions that share conserved synteny with QTL or can be used to test for overlap with modules constructed from molecular QTL data.
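The windowed SNP count described above can be sketched as follows. The sketch assumes that disease-associated human SNPs are available as (chromosome, position) pairs and that the human interval sharing conserved synteny with a mouse QTL is already known from an orthology resource. All coordinates below are hypothetical.

```python
from collections import Counter

WINDOW = 1_000_000  # 1 Mb window size


def snps_per_window(snps, window=WINDOW):
    """Count SNPs per (chromosome, window index); snps holds (chrom, position) pairs."""
    return Counter((chrom, pos // window) for chrom, pos in snps)


def windows_in_interval(counts, chrom, start, end, window=WINDOW):
    """Return the SNP counts of windows overlapping [start, end) on one chromosome."""
    first, last = start // window, (end - 1) // window
    return {
        (c, w): n
        for (c, w), n in counts.items()
        if c == chrom and first <= w <= last
    }


# Hypothetical disease-associated human SNPs.
human_snps = [("11", 2_150_000), ("11", 2_900_000), ("11", 3_100_000), ("2", 500_000)]
counts = snps_per_window(human_snps)

# Hypothetical human interval assumed to share conserved synteny with a mouse QTL.
hits = windows_in_interval(counts, "11", 2_000_000, 4_000_000)
print(hits, sum(hits.values()))
```

Windows with many associated SNPs inside a syntenic interval would be candidates for follow-up. Converting mouse QTL intervals to human coordinates is left to an external annotation source here.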

To test for an association of a gene with a particular phenotype or disease, the use of summary statistics is gaining in popularity, fueled in part by the increasing number of GWAS datasets available for computation. These methods derive gene-based P-values from published SNP-level GWAS P-values using modified Simes tests like GATES [94] or combined tests like COMBAT [95]. These gene-level association values are then used to enhance power for disease association in GWASs [96], animal linkage mapping studies [97] or for cross-species integration with QTL from animal studies (e.g. [81, 98]).
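As an illustration of how a gene-based P-value can be derived from SNP-level P-values, the sketch below implements the plain Simes procedure. It is not GATES itself: GATES extends Simes by replacing the SNP counts with effective numbers of independent tests estimated from linkage disequilibrium, which is omitted here. The input P-values are hypothetical.

```python
def simes_p(p_values):
    """Plain Simes combination: min over ranks j of m * p_(j) / j,
    where p_(1) <= ... <= p_(m) are the sorted SNP-level P-values."""
    p_sorted = sorted(p_values)
    m = len(p_sorted)
    if m == 0:
        raise ValueError("at least one P-value is required")
    if any(p < 0 or p > 1 for p in p_sorted):
        raise ValueError("P-values must lie in [0, 1]")
    return min(m * p / rank for rank, p in enumerate(p_sorted, start=1))


# Hypothetical P-values for four SNPs assigned to one gene.
gene_p = simes_p([0.20, 0.01, 0.50, 0.04])
print(gene_p)
```

Because the rank-m term equals the largest P-value, the combined value never exceeds 1. With correlated SNPs, plain Simes ignores the LD structure that GATES is designed to account for.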

Building predictive models and validation

The end goal of data integration is the construction of testable models of gene function in complex phenotypes and disease (Figure 3 and Box 1). Well-characterized biological pathways frequently benefit from decades of basic science that led to the identification of druggable targets and interventions that can be found in publicly available databases [99, 100]. In these cases, molecular pathways identified through systems genetics approaches can be manipulated with known pharmacological agents to demonstrate causality, as well as animal model validity. Where less information is available, validation efforts focus on specific QTL genes or variants. Existing collections of engineered individuals on a single genetic background, e.g. collections of gene knockouts (International Mouse Phenotyping Consortium [101]), provide resources for candidate gene functional validation if simple loss-of-function data are informative. For testing combinations of alleles discovered in genetically diverse GRPs like the DO, where every individual is unique, predictive models of the impact of genotype on phenotype can be recursively tested in individuals carrying certain combinations of alleles selected from the replicable CC population [13]. Germline and somatic cell gene editing technologies enable a range of genome engineering outcomes from classic overexpression (e.g. transgenic) or gene knockouts to precise gene editing, which allows for testing the effects of specific variants.

Box 1. Web-based Applications for Integrative Genomics.

To support integrative genomics, there is a suite of web-based tools that allow for comprehensive integration of heterogeneous molecular data across species at the gene level. These tools employ network-based methods for integration of different data types and existing genomic data resources (GTEx, ENCODE, dbGAP, GEO, various model organism databases) and genomic annotation databases (genome browsers, dbSNP/EVA, GENCODE, and others). For example, tools like GeneWeaver aim to provide platforms that operate on user-uploaded or selected gene lists (sets) and enable integration with curated biological data on gene function. New relationships and complex intersections are revealed through hierarchical similarity graphs or Boolean algebra [83, 111]. The development of statistical methods for integrative genomics is a rapidly expanding field that has been reviewed extensively elsewhere (e.g. [112]), and is advancing towards the collective goal of predictive modeling and precision medicine. Ultimately, the accuracy of these tools depends on ongoing functionalization of the genome through basic research in model organisms (e.g. KOMP), widespread development and implementation of genome technologies, and the availability and accurate curation of genomic data in well-supported, accessible databases and bioinformatic resources, including model organism databases (Alliance of Genome Resources).
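The kind of Boolean operation and set similarity applied to gene lists can be sketched with ordinary set arithmetic. This is a generic illustration and does not reproduce the interface or algorithms of GeneWeaver or any other tool. The gene set names and memberships are hypothetical, and all genes are assumed to be already mapped to human symbols.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical gene sets from three sources.
gene_sets = {
    "mouse_qtl_genes": {"INS", "GCG", "PDX1", "SLC2A2"},
    "human_gwas_genes": {"INS", "PDX1", "TCF7L2"},
    "islet_expression_module": {"INS", "GCG", "PDX1", "MAFA"},
}

# Boolean AND across all sets: genes supported by every data source.
shared_all = set.intersection(*gene_sets.values())

# Boolean AND NOT: QTL genes lacking human GWAS support.
qtl_only = gene_sets["mouse_qtl_genes"] - gene_sets["human_gwas_genes"]

# Pairwise similarity, e.g. as edge weights in a similarity graph.
similarity = {
    (a, b): jaccard(gene_sets[a], gene_sets[b])
    for a, b in combinations(sorted(gene_sets), 2)
}
print(sorted(shared_all), sorted(qtl_only), similarity)
```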

Sophisticated and high throughput technologies for genome engineering are well established in laboratory mice and reviews of contemporary approaches are available elsewhere [102]. However, decades of technology development in mouse genetic engineering have relied almost exclusively on just a few inbred strain backgrounds (i.e. C57BL/6, 129 substrains). Mouse GRPs like the CC/DO integrate divergent strain backgrounds that have proven more difficult to engineer (e.g. NOD substrains). Therefore, new advances in reproductive technologies, widespread integration of alternative reference genomes for construct design, and an expansion of “tool” strains (e.g. strains carrying ubiquitously expressed recombinases, Cas9 variants, recombinase mediated docking sites, etc.) on genetically diverse backgrounds will be required to fully leverage the power of reverse genetics for validation and mechanistic studies.

Concluding remarks

While genetics studies have traditionally focused on inheritance of single genes in association with a trait, it has been known for some time that phenotypic penetrance is often much more complicated than simple Mendelian genetics can account for. Investigation of the cause of phenotypic variation has benefited enormously from the development of high throughput technologies and computational techniques. These advances gave rise to systems genetics research, which aims to account for the breadth of complexity that exists in biological systems, synthesizing data across fields with a multitude of quantitative and qualitative approaches.

Still, the integration and availability of genetic tools for systems genetics applications have lagged behind the technological advancements. The development of the DO and CC panels opened up the ability to study cellular systems in a controlled environment with tractable, economical, and translatable genetic platforms. The ease of access to mouse biological material has enabled researchers to develop a multifaceted view of in vivo cellular systems, and an opportunity now exists to leverage advancements in cell culture technologies for the development of highly scalable in vitro systems. These tools will open up new opportunities to account for gene-gene and gene-environment interactions across a range of applications: treatment perturbations, specific cell types, developmental modeling, etc. Expanding the tool sets we have available for these systems genetics platforms will undoubtedly help to dissect the genetic architecture of complex traits and refine predictive human disease modeling (see Outstanding Questions).

Outstanding questions.

  • How can model organism genetic reference populations more accurately mimic human lifestyle and environmental exposures for predictive modeling of complex disease and advancement of precision medicine?

  • How can we best generate and disseminate DO and CC cell culture models, and enable computational analyses for systems genetics?

  • To what extent can in vitro traits provide true insight into the genetics of complex diseases?

Highlights.

  • Genetically diverse mice, such as Diversity Outbred (DO) and Collaborative Cross (CC) panels, can be used to model human genetic diversity and map genetic modifiers underlying phenotypic variation at high resolution and under controlled environmental conditions.

  • Cell lines derived from DO and CC mice can be used to study systems genetics in a variety of cell culture applications: to understand the influence of genetics on cell state transitions and on cellular function in a variety of cell types, and in toxicology and pharmacogenetic screening.

  • DO and CC cell lines enable integrated in vitro and in vivo screening capabilities across the same genotypes for multidimensional systems genetics studies and validation.

  • Integrative genomics facilitates the identification of orthologous genes and cellular pathways that regulate human diseases and traits.

Acknowledgements

We would like to thank Jason Stumpff, Steve Munger and Catherine Kaczorowski for discussion and insightful comments on this review. Cellular systems genetics research in the Reinholdt laboratory is currently supported by grants from The Jackson Laboratory and NIH (P40OD011102 and R01ES029916).

Glossary

Bayesian network algorithms

Probabilistic graphical model made up of nodes and links. Each node represents a variable and each link represents the probability of dependence between two variables.

Cellular traits

Observable and measurable phenotypes of cells (e.g. proliferation, viability, morphology).

Complex traits

A trait that does not follow simple Mendelian inheritance patterns and is influenced by multiple genetic and/or environmental factors.

Census-seq

A method to pool cells of multiple individuals and measure their cellular phenotypes. The presence of each individual’s DNA is quantified in “cell villages” before and after selection for a cellular trait.

Correlation networks

Method to study cellular networks based on pairwise correlations between variables. A widely used example is weighted gene co-expression network analysis (WGCNA).

Epistasis

Interaction between genes where the effect of one gene mutation is dependent on the presence or absence of one or more other gene mutations.

Expression QTL (eQTL)

A locus that explains the genetic basis for variation in expression of a particular gene. eQTL analysis is conducted by testing for associations between genetic variation and transcript levels from a particular gene.

Feature-feature correlation matrices

A matrix showing the correlation coefficients between all pairs of variables (e.g. molecular and/or cellular traits) to identify dependent and independent features.

Genetic reference population (GRP)

A reference population of individuals with known genotypes and phenotypes.

Genome-wide association studies (GWASs)

Statistical approach used to associate genetic variants with a particular trait in a population of individuals.

Genomic imprinting

Epigenetic marks acquired in the germline that confer parent-of-origin expression patterns of >100 genes, many of which encode important developmental regulators.

Heteroscedasticity

A change in the spread of residuals in regression analysis over a range of measured values.

High content screening (HCS)

High-throughput imaging approach that combines automated microscopy and image analysis for acquisition of a particular quantitative phenotype (fluorescence, cell growth, morphology, etc).

Linkage disequilibrium

The non-random association of alleles at different loci in a population.

Mendelian traits

Traits that are determined by dominant or recessive alleles of a single gene.

Molecular traits

Omics signature assigned to a particular genotype.

Narrow-sense heritability

The proportion of phenotypic variance that is due to additive genetic factors (not factoring in the impact of gene x gene interactions and dominance).

Pearson’s correlation coefficient

Statistic that measures linear correlation between two variables, specifically it measures covariance of the variables divided by the product of their standard deviations.

Pluripotency

The ability of a cell to give rise to all cell types of the body.

Quantitative trait loci (QTL)

Genomic regions associated with variation of a particular phenotype.

Single nucleotide polymorphisms (SNPs)

Substitution of a single nucleotide at a particular position in the genome.

Systems genetics

An approach to understanding the biological mechanisms underlying complex traits using statistical methods to integrate intermediate molecular phenotypes (e.g. transcript, protein, DNA methylation) in populations where the trait of interest varies.

Xenobiotic

Chemical substances that are foreign to a particular species and elicit a physiological response, including drugs, pesticides, industrial chemicals, cosmetics and environmental pollutants.

References

  • 1. Nadeau JH (2001) Modifier genes in mice and humans. Nat Rev Genet 2, 165–174
  • 2. Cooper DN, et al. (2013) Where genotype is not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance in human inherited disease. Hum Genet 132, 1077–1130
  • 3. Campbell M, et al. (2018) Utilizing random regression models for genomic prediction of a longitudinal trait derived from high-throughput phenotyping. Plant Direct 2, e00080
  • 4. Valdar W, et al. (2006) Genome-wide genetic association of complex traits in heterogeneous stock mice. Nat Genet 38, 879–887
  • 5. Gallagher MD and Chen-Plotkin AS (2018) The Post-GWAS Era: From Association to Function. Am J Hum Genet 102, 717–730
  • 6. Shalem O, et al. (2014) Genome-scale CRISPR-Cas9 knockout screening in human cells. Science 343, 84–87
  • 7. Sanjana NE, et al. (2016) High-resolution interrogation of functional elements in the noncoding genome. Science 353, 1545–1549
  • 8. Korkmaz G, et al. (2016) Functional genetic screens for enhancer elements in the human genome using CRISPR-Cas9. Nat Biotechnol 34, 192–198
  • 9. Diao Y, et al. (2017) A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat Methods 14, 629–635
  • 10. Panopoulos AD, et al. (2017) iPSCORE: A Resource of 222 iPSC Lines Enabling Functional Characterization of Genetic Variation across a Variety of Cell Types. Stem Cell Reports 8, 1086–1100
  • 11. Streeter I, et al. (2017) The human-induced pluripotent stem cell initiative-data resources for cellular genetics. Nucleic Acids Res 45, D691–D697
  • 12. Garbutt TA, et al. (2018) Permissiveness to form pluripotent stem cells may be an evolutionarily derived characteristic in Mus musculus. Sci Rep 8, 14706
  • 13. Skelly DA, et al. (2020) Mapping the Effects of Genetic Variation on Chromatin State and Gene Expression Reveals Loci That Control Ground State Pluripotency. Cell Stem Cell
  • 14. Tehranchi A, et al. (2019) Fine-mapping cis-regulatory variants in diverse human populations. Elife 8
  • 15. Nicolae DL, et al. (2010) Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 6, e1000888
  • 16. Calderon D, et al. (2019) Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494–1505
  • 17. Theusch E, et al. (2020) Genetic variants modulate gene expression statin response in human lymphoblastoid cell lines. BMC Genomics 21, 555
  • 18. Zheng Z, et al. (2020) QTLbase: an integrative resource for quantitative trait loci across multiple human molecular phenotypes. Nucleic Acids Res 48, D983–D991
  • 19. Carcamo-Orive I, et al. (2017) Analysis of Transcriptional Variability in a Large Human iPSC Library Reveals Genetic and Non-genetic Determinants of Heterogeneity. Cell Stem Cell 20, 518–532 e519
  • 20. DeBoever C, et al. (2017) Large-Scale Profiling Reveals the Influence of Genetic Variation on Gene Expression in Human Induced Pluripotent Stem Cells. Cell Stem Cell 20, 533–546 e537
  • 21. Kilpinen H, et al. (2017) Common genetic variation drives molecular heterogeneity in human iPSCs. Nature 546, 370–375
  • 22. Young Greenwald WW, et al. (2019) Chromatin co-accessibility is highly structured, spans entire chromosomes, and mediates long range regulatory genetic effects. bioRxiv
  • 23. Mitchell JM, et al. (2020) Mapping genetic effects on cellular phenotypes with “cell villages”. bioRxiv
  • 24. Gatti DM, et al. (2014) Quantitative trait locus mapping methods for diversity outbred mice. G3 (Bethesda) 4, 1623–1633
  • 25. Broman KW (2012) Haplotype probabilities in advanced intercross populations. G3 (Bethesda) 2, 199–202
  • 26. Liu Y, et al. (2018) Joint Analysis of Strain and Parent-of-Origin Effects for Recombinant Inbred Intercrosses Generated from Multiparent Populations with the Collaborative Cross as an Example. G3 (Bethesda) 8, 599–605
  • 27. King EG, et al. (2012) Genetic dissection of a model complex trait using the Drosophila Synthetic Population Resource. Genome Res 22, 1558–1566
  • 28. Ehrenreich IM, et al. (2010) Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature 464, 1039–1042
  • 29. Bloom JS, et al. (2013) Finding the sources of missing heritability in a yeast cross. Nature 494, 234–237
  • 30. Bloom JS, et al. (2019) Rare variants contribute disproportionately to quantitative trait variation in yeast. Elife 8
  • 31. Bloom JS, et al. (2015) Genetic interactions contribute less than additive effects to quantitative trait variation in yeast. Nat Commun 6, 8712
  • 32. Martens K, et al. (2016) Predicting quantitative traits from genome and phenome with near perfect accuracy. Nat Commun 7, 11512
  • 33. Linder RA, et al. (2020) Two Synthetic 18-Way Outcrossed Populations of Diploid Budding Yeast with Utility for Complex Trait Dissection. Genetics 215, 323–342
  • 34. Cubillos FA, et al. (2013) High-resolution mapping of complex traits with a four-parent advanced intercross yeast population. Genetics 195, 1141–1155
  • 35. Langley SR, et al. (2013) Systems-level approaches reveal conservation of trans-regulated genes in the rat and genetic determinants of blood pressure in humans. Cardiovasc Res 97, 653–665
  • 36. Justice MJ and Dhillon P (2016) Using the mouse to model human disease: increasing validity and reproducibility. Dis Model Mech 9, 101–103
  • 37. Winter JM, et al. (2017) Mapping Complex Traits in a Diversity Outbred F1 Mouse Population Identifies Germline Modifiers of Metastasis in Human Prostate Cancer. Cell Syst 4, 31–45 e36
  • 38. Neuner SM, et al. (2019) Harnessing Genetic Complexity to Enhance Translatability of Alzheimer's Disease Mouse Models: A Path toward Precision Medicine. Neuron 101, 399–411 e395
  • 39. Dunn AR, et al. (2019) Gene-by-environment interactions in Alzheimer's disease and Parkinson's disease. Neurosci Biobehav Rev 103, 73–80
  • 40. Li H and Auwerx J (2020) Mouse Systems Genetics as a Prelude to Precision Medicine. Trends Genet 36, 259–272
  • 41. Roberts A, et al. (2007) The polymorphism architecture of mouse genetic resources elucidated using genome-wide resequencing data: implications for QTL discovery and systems genetics. Mamm Genome 18, 473–481
  • 42. Keane TM, et al. (2011) Mouse genomic variation and its effect on phenotypes and gene regulation. Nature 477, 289–294
  • 43. Bryant CD, et al. (2020) Facilitating Complex Trait Analysis via Reduced Complexity Crosses. Trends Genet 36, 549–562
  • 44. Garrigan D, et al. (2007) Inferring human population sizes, divergence times and rates of gene flow from mitochondrial, X and Y chromosome resequencing data. Genetics 177, 2195–2207
  • 45. Henn BM, et al. (2012) The great human expansion. Proc Natl Acad Sci U S A 109, 17758–17764
  • 46. Churchill GA, et al. (2012) The Diversity Outbred mouse population. Mamm Genome 23, 713–718
  • 47. Rockman MV and Kruglyak L (2008) Breeding designs for recombinant inbred advanced intercross lines. Genetics 179, 1069–1078
  • 48. Hormozdiari F, et al. (2018) Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat Genet 50, 1041–1047
  • 49. Donovan MKR, et al. (2020) Cellular deconvolution of GTEx tissues powers discovery of disease and cell-type associated regulatory variants. Nat Commun 11, 955
  • 50. Neumeyer S, et al. (2020) Strengthening Causal Inference for Complex Disease Using Molecular Quantitative Trait Loci. Trends Mol Med 26, 232–241
  • 51. Ye Y, et al. (2020) A Multi-Omics Perspective of Quantitative Trait Loci in Precision Medicine. Trends Genet 36, 318–336
  • 52. Swanzey E, et al. (2020) A Susceptibility Locus on Chromosome 13 Profoundly Impacts the Stability of Genomic Imprinting in Mouse Pluripotent Stem Cells. Cell Rep 30, 3597–3604 e3593
  • 53. Ortmann D, et al. (2020) Naive Pluripotent Stem Cells Exhibit Phenotypic Variability that Is Driven by Genetic Variation. Cell Stem Cell
  • 54. Evans WE and Relling MV (1999) Pharmacogenomics: translating functional genomics into rational therapeutics. Science 286, 487–491
  • 55. Waters MD and Fostel JM (2004) Toxicogenomics and systems toxicology: aims and prospects. Nat Rev Genet 5, 936–948
  • 56. Lee MN, et al. (2014) Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science 343, 1246980
  • 57. Gibb S (2008) Toxicity testing in the 21st century: a vision and a strategy. Reprod Toxicol 25, 136–138
  • 58. Suzuki OT, et al. (2014) A cellular genetics approach identifies gene-drug interactions and pinpoints drug toxicity pathway nodes. Front Genet 5, 272
  • 59. Frick A, et al. (2015) Immune cell-based screening assay for response to anticancer agents: applications in pharmacogenomics. Pharmgenomics Pers Med 8, 81–98
  • 60. Zhang X, et al. (2011) In silico and in vitro pharmacogenetics: aldehyde oxidase rapidly metabolizes a p38 kinase inhibitor. Pharmacogenomics J 11, 15–24
  • 61. Frick A, et al. (2013) In vitro and in vivo mouse models for pharmacogenetic studies. Methods Mol Biol 1015, 263–278
  • 62. Bickle M (2010) The beautiful cell: high-content screening in drug discovery. Anal Bioanal Chem 398, 219–226
  • 63. Zanella F, et al. (2010) High content screening: seeing is believing. Trends Biotechnol 28, 237–245
  • 64. Mattiazzi Usaj M, et al. (2016) High-Content Screening for Quantitative Cell Biology. Trends Cell Biol 26, 598–611
  • 65. Persson M and Hornberg JJ (2016) Advances in Predictive Toxicology for Discovery Safety through High Content Screening. Chem Res Toxicol 29, 1998–2007
  • 66. Caicedo JC, et al. (2017) Data-analysis strategies for image-based cell profiling. Nat Methods 14, 849–863
  • 67. Bray MA, et al. (2016) Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc 11, 1757–1774
  • 68. Gustafsdottir SM, et al. (2013) Multiplex cytological profiling assay to measure diverse cellular states. PLoS One 8, e80999
  • 69. Jones TR, et al. (2009) Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proc Natl Acad Sci U S A 106, 1826–1831
  • 70. Albert FW and Kruglyak L (2015) The role of regulatory variation in complex traits and disease. Nat Rev Genet 16, 197–212
  • 71. Love MI, et al. (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550
  • 72. Subramanian A, et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545–15550
  • 73. Zyla J, et al. (2019) Gene set enrichment for reproducible science: comparison of CERNO and eight other algorithms. Bioinformatics 35, 5146–5154
  • 74. Ashburner M, et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25, 25–29
  • 75. Kanehisa M and Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28, 27–30
  • 76. Domaszewska T, et al. (2017) Concordant and discordant gene expression patterns in mouse strains identify best-fit animal model for human tuberculosis. Sci Rep 7, 12094
  • 77. Zhang B and Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4, Article17
  • 78. Al-Barghouthi BM and Farber CR (2019) Dissecting the Genetics of Osteoporosis using Systems Approaches. Trends Genet 35, 55–67
  • 79. Keller MP, et al. (2018) Genetic Drivers of Pancreatic Islet Function. Genetics 209, 335–356
  • 80. Tyler AL, et al. (2017) Epistatic Networks Jointly Influence Phenotypes Related to Metabolic Disease and Gene Expression in Diversity Outbred Mice. Genetics 206, 621–639
  • 81. Keller MP, et al. (2019) Gene loci associated with insulin secretion in islets from non-diabetic mice. J Clin Invest 130, 4419–4432
  • 82. Komljenovic A, et al. (2019) Cross-species functional modules link proteostasis to human normal aging. PLoS Comput Biol 15, e1007162
  • 83. Bubier JA, et al. (2017) Integrative Functional Genomics for Systems Genetics in GeneWeaver.org. Methods Mol Biol 1488, 131–152
  • 84. Chintalapudi SR, et al. (2017) Systems genetics identifies a role for Cacna2d1 regulation in elevated intraocular pressure and glaucoma susceptibility. Nat Commun 8, 1755
  • 85. Hernandez Cordero AI, et al. (2020) Genome-wide Associations Reveal Human-Mouse Genetic Convergence and Modifiers of Myogenesis, CPNE1 and STC2. Am J Hum Genet 106, 138
  • 86. Mesner LD, et al. (2019) Mouse genome-wide association and systems genetics identifies Lhfp as a regulator of bone mass. PLoS Genet 15, e1008123
  • 87. Quiros PM, et al. (2017) Multi-omics analysis identifies ATF4 as a key regulator of the mitochondrial stress response in mammals. J Cell Biol 216, 2027–2045
  • 88. Sorrentino V, et al. (2017) Enhancing mitochondrial proteostasis reduces amyloid-beta proteotoxicity. Nature 552, 187–193
  • 89. Ding H, et al. (2019) Biological process activity transformation of single cell gene expression for cross-species alignment. Nat Commun 10, 4899
  • 90. Durinck S, et al. (2009) Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc 4, 1184–1191
  • 91.Zerbino DR, et al. (2018) Ensembl 2018. Nucleic Acids Res 46, D754–D761 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Vilella AJ, et al. (2009) EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res 19, 327–335 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Gu Z, et al. (2014) circlize Implements and enhances circular visualization in R. Bioinformatics 30, 2811–2812 [DOI] [PubMed] [Google Scholar]
  • 94.Li MX, et al. (2011) GATES: a rapid and powerful gene-based association test using extended Simes procedure. Am J Hum Genet 88, 283–293 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Wang M, et al. (2017) COMBAT: A Combined Association Test for Genes Using Summary Statistics. Genetics 207, 883–891 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 96.Wu C and Pan W (2018) Integration of Enhancer-Promoter Interactions with GWAS Summary Results Identifies Novel Schizophrenia-Associated Genes and Pathways. Genetics 209, 699–709 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Tyler AL, et al. (2019) Network-Based Functional Prediction Augments Genetic Association To Predict Candidate Genes for Histamine Hypersensitivity in Mice. G3 (Bethesda) 9, 4223–4233 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Ashbrook DG, et al. (2019) A Cross-Species Systems Genetics Analysis Links APBB1IP as a Candidate for Schizophrenia and Prepulse Inhibition. Front Behav Neurosci 13, 266. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 99.Wishart DS, et al. (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 34, D668–672 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Cotto KC, et al. (2018) DGIdb 3.0: a redesign and expansion of the drug-gene interaction database. Nucleic Acids Res 46, D1068–D1073 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Cacheiro P, et al. (2019) New models for human disease from the International Mouse Phenotyping Consortium. Mamm Genome 30, 143–150 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Lanigan TM, et al. (2020) Principles of Genetic Engineering. Genes (Basel) 11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 103.1000 Genomes Project Consortium, et al. (2015) A global reference for human genetic variation. Nature 526, 68–74
  • 104.Czechanski A, et al. (2014) Derivation and characterization of mouse embryonic stem cells from permissive and nonpermissive strains. Nat Protoc 9, 559–574
  • 105.Park S, et al. (2018) Genetic Regulation of Fibroblast Activation and Proliferation in Cardiac Fibrosis. Circulation 138, 1224–1235
  • 106.Morse HC 3rd, et al. (1979) Expression of xenotropic murine leukemia viruses as cell-surface gp70 in genetic crosses between strains DBA/2 and C57BL/6. J Exp Med 149, 1183–1196
  • 107.Peirce JL, et al. (2004) A new set of BXD recombinant inbred lines from advanced intercross populations in mice. BMC Genet 5, 7.
  • 108.Churchill GA, et al. (2004) The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat Genet 36, 1133–1137
  • 109.Graham JB, et al. (2017) Extensive Homeostatic T Cell Phenotypic Variation within the Collaborative Cross. Cell Rep 21, 2313–2325
  • 110.Svenson KL, et al. (2012) High-resolution genetic mapping using the Mouse Diversity outbred population. Genetics 190, 437–447
  • 111.Bubier JA, et al. (2016) Cross-Species Integrative Functional Genomics in GeneWeaver Reveals a Role for Pafah1b1 in Altered Response to Alcohol. Front Behav Neurosci 10, 1.
  • 112.Richardson S, et al. (2016) Statistical Methods in Integrative Genomics. Annu Rev Stat Appl 3, 181–209

RESOURCES