The Plant Cell. 2020 Nov 27;33(4):940–960. doi: 10.1093/plcell/koaa016

A systems genetics approach to deciphering the effect of dosage variation on leaf morphology in Populus

Héloïse Bastiaanse, Isabelle M Henry, Helen Tsai, Meric Lieberman, Courtney Canning, Luca Comai, Andrew Groover
PMCID: PMC8226299  PMID: 33793772

Abstract

Gene copy number variation is frequent in plant genomes of various species, but the impact of such gene dosage variation on morphological traits is poorly understood. We used a large population of Populus carrying genomically characterized insertions and deletions across the genome to systematically assay the effect of gene dosage variation on a suite of leaf morphology traits. A systems genetics approach was used to integrate insertion and deletion locations, leaf morphology phenotypes, gene expression, and transcriptional network data, to provide an overview of how gene dosage influences morphology. Dosage-sensitive genomic regions were identified that influenced individual or pleiotropic morphological traits. We also identified cis-expression quantitative trait loci (QTL) within these dosage QTL regions, a subset of which modulated trans-expression QTL as well. Integration of data types within a gene co-expression framework identified co-expressed gene modules that are dosage sensitive, enriched for dosage expression QTL, and associated with morphological traits. Functional description of these modules linked dosage-sensitive morphological variation to specific cellular processes, as well as candidate regulatory genes. Together, these results show that gene dosage variation can influence morphological variation through complex changes in gene expression, and suggest that frequently occurring gene dosage variation has the potential to likewise influence quantitative traits in nature.


A genome-wide screen in Populus reveals the effects of gene dosage variation on leaf morphological phenotypes.

Introduction

Gene dosage is a fundamental but only partially understood source of morphological variation in nature. Advances in sequencing technologies have demonstrated that structural variation and associated copy number variation (CNV) are prevalent in plant genomes (Lye and Purugganan, 2019), including in the model tree genus Populus, where 3,230 genes (about 7.9% of the gene models) displayed CNV among three species (Fehrmann et al., 2015; Pinosio et al., 2016). CNV affects the ratio of a gene’s copy number relative to background ploidy, and thus causes a disruption in the stoichiometric balance between genes. Instances of CNV and the associated variation in gene dosage have been associated with cancer in humans (Fehrmann et al., 2015), as well as with variation of numerous agricultural traits in plants and animals (Lye and Purugganan, 2019). However, there are few experimental systems capable of addressing the role of gene dosage variation on phenotypic variation genome-wide, leaving significant questions as to the general frequency and effects of gene dosage variation on traits, including morphological traits.

Theoretical considerations and experimental evidence have been used to develop mechanistic models of how gene dosage can affect phenotypes (Birchler and Veitia, 2012). An obvious potential effect is change in the relative expression of genes directly affected by dosage variation (i.e. cis or local effects) that in turn modulate phenotypes, although dosage compensation can buffer these effects (Song et al., 2020). For genes encoding regulatory proteins, dosage variation can indirectly modulate the expression of genes located outside of the indels as well (i.e. trans or distant effects) and result in complex regulatory events and outcomes. Mechanistically, dosage variation can also cause changes in the stoichiometric relationships within protein complexes, as shown for maize (Zea mays) aneuploids (Birchler and Newton, 1981), and is expected to have overall detrimental effects. However, basic questions including the frequency of dosage-sensitive genes and the response of transcriptional networks that ultimately connect dosage and phenotypic variation at a genome level have only recently become tractable in plants.

Differing experimental approaches have led to varied perspectives on the relative importance of gene dosage on morphology in plants. Historically, aneuploids, i.e. individuals carrying incomplete chromosome sets, have provided extreme examples of dosage variation and associated phenotypic consequences, including on leaf shape. For example, in the 1920s, Blakeslee (1921, 1922) identified and characterized a series of “chromosome mutants” in Datura, which later were identified as trisomics, i.e. each mutant was carrying an extra copy of one specific chromosome. These mutants exhibited variation in many phenotypic traits, including leaf size and shape (Blakeslee, 1922). Since then, trisomics have been described in several other plant species, and in all cases exhibited phenotypes specific to the identity of the triplicated chromosome (Blakeslee, 1922; Koornneef and Van der Veen, 1983; Henry et al., 2010; Singh, 2016). In Arabidopsis thaliana, some leaf traits have been characterized in aneuploids, and linked to dosage variation of specific regions of the genome (Henry et al., 2010). Single gene-level views of the effect of dosage variation are provided by analyses based on extensive loss of function mutagenesis in Arabidopsis. Loss of function phenotypes were found for 2,400 genes (about 9% of Arabidopsis genes), for which approximately a third conditioned morphological phenotypes (Lloyd and Meinke, 2012). However, these analyses found very few examples of phenotypic consequences attributed to haploinsufficiency, which could indicate that plants are relatively tolerant to changes in gene dosage (Meinke, 2013), or that more subtle phenotypes in heterozygous individuals may be underreported. On the other hand, classical quantitative genetic analysis can attribute variation in quantitative phenotypes to allelic variation, including CNV (Stranger et al., 2007).

From an evolutionary perspective, polyploidy and genome duplication have shaped genomes in angiosperm lineages, resulting in frequent CNV and structural variation (Swanson-Wagner et al., 2010; Amborella Genome Project, 2013; Causse et al., 2013). How such variation affects phenotypes in plants is currently being addressed (Lye and Purugganan, 2019; Soyk et al., 2019). In Arabidopsis, frequent CNV was found in a survey of more than 1,000 Arabidopsis accessions, with evidence of selection on CNV and associations between CNV and disease response traits in proof of concept analyses (Zmienko et al., 2020). In tomato (Solanum spp.), long-read nanopore sequencing revealed over 230,000 structural variants across 100 lines (Alonge et al., 2020). Variants were identified affecting gene dosage that impacted multiple quantitative domestication traits associated with fruit size, flavor, and yield. These results underscore the potential for previously transparent structural variation and CNV to play fundamental roles in quantitative trait variation in other species, including forest trees.

Systems genetics approaches integrate genomic, gene expression, and phenotypic data in a genetic framework to describe variability for traits at scales ranging from genome-wide to candidate gene identification (Nadeau and Dudley, 2011; Lye and Purugganan, 2019). The general approach links primary genetic variation at the DNA sequence level and intermediate molecular phenotypes, such as transcript, protein, or metabolite abundance, to ultimately describe the molecular determinants underlying complex traits (Civelek and Lusis, 2014; Moreno-Moral et al., 2017). By integrating multiple layers of biological information, systems genetics approaches have been successful in bridging DNA variations across individuals of a population with higher levels of regulation including the gene regulatory networks and molecular pathways that ultimately affect complex phenotypes (Nadeau and Dudley, 2011). At finer levels, systems genetics has proven useful in the discovery of master trans-acting genetic regulators, which in some cases revealed unexpected gene targets influencing traits (Nadeau and Dudley, 2011; Moreno-Moral and Petretto, 2016).

Populus is an ideal model to study the relationships between gene dosage variation, gene expression, and morphological variation using systems genetics. A high-quality reference genome is available (Tuskan et al., 2006), and Populus genotypes with structural variation that could not be transmitted through meiosis can be vegetatively propagated and immortalized. In previous work, we developed a large collection of clonally propagated F1 Populus nigra × Populus deltoides hybrid lines that collectively saturate the genome 10-fold with experimentally induced deletions and insertions (indels) of known sizes and positions (Henry et al., 2015). Approximately 58% of lines carried at least one indel mutation. Indel mutations were represented by deletions (71%) and insertions (29%) originating from the male P. nigra and ranged from 0.3 Mbp to entire chromosomes in size. Using this resource, we investigated the impact of gene dosage variation on Populus biomass and phenology-related phenotypic traits and identified large numbers of dosage-sensitive loci affecting growth and phenology (Bastiaanse et al., 2019). However, linkages between gene dosage, gene expression, and phenotypic variation were not addressed.

Leaf shape is genetically controlled and has been highly modified through evolution, resulting in striking morphological variation among and within diverse taxa (Chitwood and Sinha, 2016). Here, we took a systems genetics approach to integrate data from dosage variation, gene expression, co-expression networks, and leaf phenotypes, to characterize the effect of dosage variation on leaf morphology in our Populus indel population. We show that dosage variation has significant and frequent impacts on leaf morphology, and that these impacts can be linked to variation in transcript levels of both genes with varying dosage (cis effects), as well as genes located elsewhere in the genome (trans effects). We summarize these complex changes in gene expression within a co-expression framework to identify both broader mechanisms affected by dosage that in turn influence leaf morphology, as well as individual candidate genes.

Results

Our overall strategy for linking gene dosage variation, gene expression, and leaf morphological variation is conceptually summarized in Figure 1. A minimum of three clonal replicates (ramets) of over 600 Populus indel lines were characterized for leaf morphometric traits including leaf shape, leaf size, stomatal patterning, and serration characteristics (Figure 1, A). Expression levels of individual genes were assessed by sequencing the leaf transcriptomes of 165 of these lines, and these data were used in integrated systems genetics analyses. First, whole genome scans identified regions of the genome where dosage variation correlated with phenotypic variation (phenotype dosage quantitative trait loci [dQTL], Figure 1, B). Second, RNA-sequencing was used to describe the effects of dosage in cis (direct response of genes when their dosage is changed) and in trans (indirect response of genes elsewhere in the genome) on the expression of individual genes (expression dQTL, Figure 1, C left), as well as to identify differentially expressed genes in pools of lines with extreme phenotypes. This list of genes was subsequently limited to the genes physically located under the phenotype dQTL bins (differential expression analysis, Figure 1, C right). Third, co-expression network analyses were used to summarize modules of co-expressed genes across the various lines. Modules of co-expressed genes were functionally annotated for the biological functions they encode and their correlations with the phenotypic traits (Figure 1, D). Lastly, to better understand the complex interactions at play between genes of interest and their functional role in phenotypic trait variation, we evaluated them within the broader context of the gene co-expression networks (Figure 1, E). Overall, these analyses provide mechanistic connections between gene dosage, expression of individual genes and groups of co-expressed genes, and Populus leaf morphometric traits, as described below.

Figure 1.


Overview of the analytic workflow used to investigate the genomic architecture of indel mutation on individual gene expression and co-expression networks. (A) Populus indel lines were characterized for various leaf morphometric traits, genomically characterized for indel positions, and assayed for gene expression in leaf tissues. (B) Genetic correlations were used to identify chromosomal bins in which gene dosage variation was proportional to the phenotypic values (phenotype dQTL). (C) Gene expression correlations were used to identify the genes for which expression was proportional to gene dosage at individual phenotype dQTL bins (expression dQTL). A distinction was made between local (cis-) and distant (trans-) regulation based on the physical position of the gene relative to the chromosomal bin evaluated. In addition, a differential expression analysis between extreme phenotypic mutants was performed to identify candidate genes within the phenotype dQTL bins. (D) Co-expression network analysis: modules of genes that were co-expressed across the various lines were identified. Such modules were then related to the phenotypic traits by evaluating the level of correlation of the module eigengene expression with trait variation, and to biological function using a gene enrichment analysis. (E) All data were integrated to reveal the genomic architecture of the regions and genes regulating leaf morphometric traits. The interactions at play between expression dQTL and differentially expressed genes, and their importance toward the phenotype and the global biochemical pathways they encode, were further evaluated in the broader context of the gene co-expression network. Candidate genes were mapped onto the network modules, and further evaluated for their importance as key regulators of the module (module membership) and gene significance to the phenotypic trait.

Global leaf morphometric analysis of the indel population

Clonal replicates of individual F1 irradiation insertion–deletion lines were phenotyped for leaf morphology traits in an experimental field over two consecutive years (“Materials and methods” section). The outline of the first fully mature leaf subtending the apex of each tree was extracted from images and used to compute various descriptors of leaf size, including the area, perimeter, length, and width (“Materials and methods” section). Leaf shape descriptors were also calculated, including leaf circularity and descriptors of the vertical and horizontal asymmetry consisting of width and length ratios taken at 25% and 75% of the blade width or length. Additional characters measured included descriptors of stomatal patterning (stomata density and clustering) and leaf serration measurements (indent number, width, and depth).

Additionally, an analysis of leaf shape independent of size was performed using an elliptic Fourier series approach (“Materials and methods” section). For this analysis, a principal component analysis (PCA) was applied to elliptic Fourier harmonic coefficients calculated for each leaf shape, to provide quantitative variables (orthogonal and uncorrelated) that represent morphological components of the original shape. The first four principal components (PCs) explained 49.7%, 31.9%, 6.7%, and 5.1% of the symmetrical variance of leaf shape, respectively, and described distinct aspects of shape (Figure 2). For instance, while PC1 values at either extreme were associated with round or elongated lamina, PC2 values captured leaf tip characteristics. Together, the four PCs explained a total of 93.4% of all symmetrical shape variance measured.
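For readers unfamiliar with the approach, the standard form of an elliptic Fourier expansion of a closed outline is sketched below. The symbols (t for arc length along the outline, T for the total perimeter, N for the number of harmonics retained, and the coefficient names) are conventional notation assumed here, not taken from this article; the exact normalization and number of harmonics used in the study are given in its "Materials and methods" section.

```latex
\begin{aligned}
x(t) &= a_0 + \sum_{n=1}^{N} \left[ a_n \cos\!\left(\frac{2\pi n t}{T}\right) + b_n \sin\!\left(\frac{2\pi n t}{T}\right) \right], \\
y(t) &= c_0 + \sum_{n=1}^{N} \left[ c_n \cos\!\left(\frac{2\pi n t}{T}\right) + d_n \sin\!\left(\frac{2\pi n t}{T}\right) \right].
\end{aligned}
```

Each leaf is thus summarized by the coefficients (a_n, b_n, c_n, d_n) for n = 1, ..., N, and the PCA described above operates on these coefficients rather than on the raw outline coordinates.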

Figure 2.


PCA of the elliptical Fourier descriptors of symmetrical shape variation within the indel poplar population. Each row represents the mean leaf shape at −2 to +2 standard deviation (SD) along the first four PC axes, as well as the percentage of the variation explained by each PC.

The leaf shape of most indel lines was similar to those of their siblings lacking indels, but a subset of indel lines produced leaves with atypical shapes and sizes (Figure 3, A). To systematically quantify and visualize this diversity, a K-means cluster analysis was applied to the PC1 and PC2 values of the Fourier coefficients for indel and non-indel lines, their parental lines (female P. deltoides and male P. nigra), as well as leaf shapes from a diversity of related species of Salix and Populus (the two genera encompassing Salicaceae) to serve as reference for visual comparison. Leaf shapes fell into five clusters within PC space (Figure 3, B). The majority of lines clustered either with the P. deltoides (red cluster) or the P. nigra (blue cluster) parental lines, or were intermediate between these two groups (yellow cluster) and included leaf shapes similar to Populus x canescens, Populus tremuloides, and Populus tremula. The last two K-means clusters departed from these three groups. In the orange cluster, leaf shapes of indel lines overlapped with species Salix cinerea, Salix caprea, and Populus maximowiczii, as well as one of the two shapes of the heteroblastic desert poplar Populus euphratica. Finally, lines with some of the most extreme leaf shapes were observed in the purple cluster that contained narrow and oblong shapes that overlapped with Populus balsamifera, Populus simonii, P. euphratica, Populus lasiocarpa, Populus trichocarpa, and Salix pentandra.
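The clustering step can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) applied to two-dimensional scores such as PC1 and PC2. This is a generic illustration, not the authors' implementation: the function name `kmeans`, the random initialization, and the toy coordinates are assumptions made for the example.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster 2-D points (e.g. PC1/PC2 scores) with Lloyd's algorithm.

    Returns (labels, centers). Initial centers are k points drawn at random,
    which is an assumption of this sketch, not a detail from the study.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centers


# Hypothetical PC1/PC2 scores forming two well-separated groups.
scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centers = kmeans(scores, k=2)
```

In the study, five clusters were retained; the choice of k and the software used are described in the article's methods rather than here.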

Figure 3.


Overview of the diversity in leaf shape and leaf size within the indel mutant population. (A) Striking differences in leaf shape and leaf size were observed among the lesion lines. Control non-lesion lines were overall more similar to the left-most photograph. (B) K-means clustering of the first and second PCs of the symmetric elliptical Fourier quantitative variables. Analyses were performed on leaf outlines from individual ramets of the indel mutant population, their parental lines female P. deltoides and male P. nigra, as well as some representative species of the Populus and Salix genera. For individuals from the indel population, each of the five K-means clusters was plotted in different colors. Black dots were used for the other representative species. The mean leaf outline of each cluster is represented in the center of its corresponding cluster. Leaf outlines from the other species are represented on the side of their corresponding cluster.

Enrichment or depletion of indel versus control non-indel lines was determined for each of the K-means clusters using chi-square analysis (Supplemental Dataset S1). Interestingly, indel lines were significantly overrepresented in the red cluster containing the maternal line P. deltoides, but significantly depleted in the blue cluster containing the gamma-irradiated male parent P. nigra. Since indel lines harbor ∼2× more deletions than insertions, the loss of P. nigra alleles may have caused the indel lines within the red cluster to be more similar to their female parent (P. deltoides). Conversely, the blue cluster showed overrepresentation of control non-lesion lines (lacking indels) whose leaf shapes were more similar to the P. nigra parent. A significant overrepresentation of lesion lines was also detected in the purple cluster containing the most extreme leaf shapes.
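A test of this kind can be sketched as a Pearson chi-square test on a 2×2 table (indel versus non-indel lines, inside versus outside a given cluster). The sketch below assumes no continuity correction and uses made-up counts; the actual counts and test settings are in Supplemental Dataset S1 and the methods, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows: indel / non-indel lines. Columns: in cluster / not in cluster.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 30 indel and 10 control lines fall in the cluster,
# 10 indel and 30 control lines fall outside it.
stat, p = chi_square_2x2(30, 10, 10, 30)
```

A small p-value indicates that cluster membership and indel status are not independent; the direction (enrichment or depletion) is read from the observed versus expected counts.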

Morphometric traits are correlated and influenced by the presence of indels

Correlations were calculated among all of the global leaf shape descriptors to further interpret the distinct elements of leaf shapes described by each of the Fourier PCs (Supplemental Figure S1). Fourier PC1 was positively correlated with the circularity index, and negatively correlated with the horizontal symmetry and length-to-width ratio. Fourier PC2 was positively correlated with the perimeter-to-area ratio and the circularity index, and negatively correlated with the length-to-width ratio. Fourier PC3 was negatively correlated with the perimeter-to-leaf area ratio and the circularity index. Finally, Fourier PC4 was negatively correlated with the horizontal symmetry, as well as with the length-to-width ratio. Fourier PC1 and PC2 were positively correlated with leaf size descriptors, while Fourier PC3 and PC4 were negatively correlated with them.

Leaf shape and size were not completely uncoupled in our poplar population. Comparison of the lines grouped into each of the five clusters presented in Figure 3, B revealed that both the orange and purple clusters, which contained lines with some of the most extreme leaf shapes, also presented significant differences (P < 0.001) in leaf size (Supplemental Figure S2). Abaxial versus adaxial stomatal patterning showed strong correlations (r² = 0.7 for density, 0.8 for distance, and 0.5 for clustering index), while only weak correlations (≤0.3) were found between stomatal patterning descriptors and leaf morphometry (e.g. leaf shape, size, and serration). For leaf serration measures, the strongest correlations (up to 0.9) were seen with leaf size, as well as with the length-to-width ratio. Finally, strong correlations were found between leaf size and tree biomass (including tree height, weight, and volume), similar to previous findings in poplar (Verlinden et al., 2013).

To test the influence of gene dosage on the morphometric traits, we compared the distributions of trait values between the indel lines (lines carrying at least one indel) and control non-indel lines (control 0 Gy and 100 Gy-irradiated lines with no detectable indel). Here, the effect of indel mutations was analyzed collectively, independently of size or genomic location of the individual indels or lines. Continuous unimodal distributions were observed in both indel and control non-indel lines (“Materials and methods” section; Supplemental File S1). For the majority of the traits (80%), analysis of variance (ANOVA) demonstrated significant differences (P < 0.05; Supplemental File S1) in mean phenotypic trait values between the two groups. Significant differences were observed for all combinations of the first four PCs of the elliptic Fourier descriptors as well as for some individual PCs of these same descriptors, and for vertical leaf symmetry and length-to-width ratio. Leaf area, length, width, perimeter, and serrations were significantly smaller among the indel lines. Abaxial and adaxial stomata densities were lower in indel lines, with increased mean nearest neighbor distance (NND). The stomata clustering index of the indel and control non-indel lines was similar at the abaxial surface of the leaf but slightly larger (0.01 < P-value < 0.05) at the adaxial surface of the leaf. Overall, when examining the range of phenotypic distributions, extreme phenotypes were more frequently observed among the indel lines than in the control lines, and the indel lines exhibited an overall wider phenotypic distribution for multiple traits.

Leaf morphometry traits are heritable and correlate with gene dosage at specific chromosomal bins

Next, we examined the genetic architecture of morphology traits relative to dosage. At a gross phenotypic level, traits showed modest broad-sense heritabilities (H²), ranging from 0.2 to 0.4 (Supplemental Figure S3 and Supplemental Dataset S2). To determine if specific dosage-sensitive regions of the genome affected morphology traits, genome scans were used to examine correlation of the best linear unbiased predictor (BLUP) for each phenotypic trait with the relative gene dosage ratio at individual chromosomal bins across the indel population, while controlling for false discovery (“Materials and methods” section; Bastiaanse et al., 2019). Bins with significant correlations were referred to as phenotype dQTL (Figure 1, B). This approach is illustrated by the comparison of mean leaf shape observed between deletion, insertion, and non-indel lines underlying phenotype dQTL bins for the Fourier PC1:PC3 trait in Figure 4. Each phenotype dQTL is defined by both location and specific effects of modulation of relative gene dosage on leaf shape. The effect of modulating gene dosage at each of the six significant phenotype dQTL for this trait on leaf shape is contrasted in the individual plots in Figure 4.
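Because one test is run per chromosomal bin, a genome scan of this kind needs a multiple-testing correction. The sketch below shows the Benjamini-Hochberg adjustment as one common way of controlling the false discovery rate; whether this exact procedure was used here is an assumption (the article refers to its methods and to Bastiaanse et al., 2019), and the per-bin p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-bin p-values from correlating trait BLUPs with dosage ratio.
bin_p = [0.01, 0.04, 0.03, 0.005]
bin_q = benjamini_hochberg(bin_p)
significant_bins = [i for i, q in enumerate(bin_q) if q < 0.05]
```

Bins whose adjusted value falls below the chosen threshold would be the candidates reported as phenotype dQTL under this scheme.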

Figure 4.


Effect of dosage variation of specific genomic bins on leaf shape. Manhattan plots showing the effect of dosage variation on leaf shape for genomic bins for which dosage was associated with the Fourier PC1:PC3 traits (phenotype dQTL). For each phenotype dQTL, start and end positions are indicated. Some of these chromosomal bins overlapped with phenotype dQTL controlling other traits that could also contribute to the mean leaf shape of deletion (purple), insertion (green), and no indel lines (black) represented here. Colors in the Manhattan plots and in titles of the mean leaf shape plots refer to associations found with various poplar chromosomes.

Globally, phenotype dQTL were identified for 67 of 72 leaf morphometric traits (Supplemental Dataset S3) and were located on 17 of the 19 Populus chromosomes. Traits for which no significant associations were found with dosage were Vertical_symmetry_y2, Vertical_symmetry_y1_y2, Fourier_PC2:PC4_y2, Fourier_PC3:PC4_y1, and Indent_depth_y1. Co-localization among some phenotype dQTL for multiple traits is consistent with a pleiotropic effect of dosage variation for some chromosomal bins, as shown for genomic regions located on chromosomes 1, 2, 14, and 19 influencing all categories of leaf morphometric traits (Figure 5, A), in addition to tree biomass-related traits previously described (Bastiaanse et al., 2019). Interestingly, the genomic regions on chromosomes 1, 14, and 19 that control many of our leaf shape and biomass-related traits also overlapped with regions described in other genetic association studies conducted in poplar: chromosome 1 was associated with a major QTL controlling a variety of leaf shape characteristics (Xia et al., 2018), and chromosomes 14 and 19 were also identified as hotspots for biomass-related traits in poplar (Rae et al., 2009). In contrast, other chromosomal bins were unique to morphometry trait categories, including bins located on chromosome 17 influencing leaf shape-related traits, and chromosome 18 influencing stomata-related traits only. In general, leaf shape-related dQTL colocalized more often with leaf size dQTL, while regions specific to stomatal patterning traits were more frequently observed. When data from multiple years of observations were available for the same trait, the phenotype dQTL obtained were often similar across years (Supplemental File S2). The majority of the phenotypic trait variation was shown to be governed by many loci (7 ± 5 loci [mean ± SD], and up to 17 loci) of modest effect in terms of percentage of variance explained (3.1 ± 0.1 [mean ± SE]; Supplemental Dataset S3). For simplicity, only phenotypic values from the combined years are discussed below.

Figure 5.


dQTL identified in the indel mutant lines. (A) Position and size of the phenotype dQTL along the 19 chromosomes of poplar. Phenotypes consisted of the leaf morphometric traits, divided into four categories: leaf shape (dark blue), leaf size (green), stomata patterning (purple), and serration (red). In addition to these leaf morphometric traits, we added the size and position of tree biomass-related traits (orange) identified in the same population and published in Bastiaanse et al. (2019). Specifically, from top to bottom each row represents the size and position of phenotype dQTL corresponding to: Tree_weight_y2, tree_volume_y2, tree_height_y2, Indent_depth_y1_y2, Indent_width_y1_y2, Indent_number_y1_y2, Abaxial_stomata_cluster, Abaxial_stomata_distance, Abaxial_stomata_density, Perimeter2:area2_y1_y2, Length_y1_y2, Length_y1_y2, Width_y1_y2, Perimeter_y1_y2, Area_y1_y2, Horizontal_symmetry_y1_y2, Length:width_y1_y2, Circularity_y1_y2, Fourier_PC3:PC4_y1_y2, Fourier_PC1:PC3_y1_y2, Fourier_PC1:PC2_y1_y2, Fourier_PC4_y1_y2, Fourier_PC3_y1_y2, Fourier_PC2_y1_y2, and Fourier_PC1_y1_y2. (B) Heatmap of the number of indel mutations underlying each dosage chromosomal bin in the subset of ∼165 lines of the population that was selected for the transcriptome analyses. (C) Frequency of occurrence of trans-expression dQTL along the dosage chromosomal bins. The dashed line indicates the significance threshold above which we can declare significant colocalization of trans-expression dQTL, as determined by 1,000 random permutations of the size and position of the expression dQTL revealed in the population. (D) Heatmap of the number of indel mutations underlying each chromosomal bin in the ∼650 lines of the full mutant population that was used to characterize the phenotype dQTL. (E) Position and size of the expression dQTL. The x-axis represents the position of the expression dQTL and the y-axis the physical location of each gene on the P. trichocarpa reference genome. Each dot represents a significant association between gene expression and gene dosage variation at individual chromosomal bins. Dark and light pink indicate significant associations at Kendall correlation Padjust <0.001 and <0.01, respectively. Cis-expression dQTL lie along the diagonal of the graph and correspond to associations between dosage chromosomal bins and genes that are physically located under the corresponding bin. Trans-expression dQTL depart from the diagonal and are located outside the square pattern lying along this diagonal, as this pattern is due to the correlated nature of our dosage chromosomal bins. Gray shading represents regions of the genome for which we did not have the statistical power to assess the expression dQTL (fewer than five mutants having an insertion or deletion in the particular genomic bin).

Leaf gene expression levels are heritable and correlate with gene dosage variation at chromosomal bins

To investigate the effect of gene dosage variation on gene expression and ultimately on leaf morphometric traits, we sequenced the transcriptomes of leaves of the same developmental stage from 475 ramets, corresponding to 164 unique lines (“Materials and methods” section). Because a single developmental timepoint was sampled, the results presented here likely omit linkages between gene expression at other timepoints and final leaf morphological phenotypes. The best linear unbiased estimator (BLUE) of individual genes was calculated across clonal ramets of the same line, and individual gene expression was reasonably heritable (average H2 = 0.38 ± 0.14; Supplemental Dataset S4). The correlation between relative gene dosage ratio at chromosomal bins and transcript levels of individual genes was determined (“Materials and methods” section), and significant gene expression–bin associations were identified as expression dQTL. Among the 164 lines analyzed, 143 lines contained indels that cumulatively covered 92.4% of the genome with at least one indel, corresponding to 452 dosage bins with an average indel genome coverage of 5.3× (Supplemental Figure S4). Significant associations (Padjust <0.01) were only detectable for chromosomal bins defined by at least five mutant lines, including 239 bins located on 17/19 chromosomes, which cumulatively covered 45.4% of the genome.
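The Figure 5 caption indicates that expression–dosage associations were assessed with Kendall correlation. A minimal sketch of Kendall's tau-b, which handles the many ties expected when most lines share the same dosage ratio at a bin, is given below. The choice of the tau-b variant and the toy values are assumptions of this example; significance testing and multiple-testing adjustment are omitted.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    n = len(x)
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Pairs not tied on x, and pairs not tied on y, respectively.
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    denom = math.sqrt(untied_x * untied_y)
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical relative dosage ratios at one bin and expression values of one gene
# across four lines (two non-indel lines, one insertion, one deletion).
dosage = [1.0, 1.0, 1.5, 0.5]
expression = [10.0, 12.0, 20.0, 5.0]
tau = kendall_tau_b(dosage, expression)
```

A strongly positive tau for a gene located inside the bin is the pattern expected for a cis-expression dQTL, where transcript level tracks copy number.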

Dosage-responsive genes were categorized as local (cis-) and distant (trans-) expression dQTL (Figure 1, C). A cis-expression dQTL reflects the direct dosage effect on the expression of a gene physically located in the copy-variant DNA, while a trans-expression dQTL reflects the indirect effect of local dosage variation on genes located elsewhere in the genome. Because lesions overlap across multiple contiguous bins (Supplemental Figure S5), cis-correlations were found to propagate over those bins and could be mistaken for trans-expression regulation. We thus defined trans-expression dQTL genes as excluding genes located in bins correlated and contiguous with the cis-expression dQTL under study. A heatmap summarizing the statistical significance and genomic locations of all cis- and trans-expression dQTL is presented in Figure 5, E. All 6,890 cis-expression dQTL can be visualized along the diagonal of the plot. The 2,554 trans-expression dQTL associations (corresponding to 1,088 genes) depart vertically above and below the diagonal.
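The classification rule described above can be sketched as follows. The `Bin` type, the function name, the half-open coordinate convention, and the "excluded" label for genes in contiguous correlated bins are all hypothetical choices made for this illustration; how correlated, contiguous bins were actually identified is described in the article's methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """A dosage chromosomal bin (hypothetical structure; half-open interval)."""
    chrom: str
    start: int
    end: int


def classify_association(gene_chrom, gene_pos, tested_bin, linked_bins):
    """Label a significant gene-bin association as 'cis', 'excluded', or 'trans'.

    linked_bins: bins contiguous and correlated with tested_bin, where a cis
    signal may propagate and must not be counted as trans.
    """
    def contains(b):
        return b.chrom == gene_chrom and b.start <= gene_pos < b.end

    if contains(tested_bin):
        return "cis"
    if any(contains(b) for b in linked_bins):
        return "excluded"
    return "trans"


# Hypothetical bins and gene positions.
tested = Bin("Chr01", 0, 1_000_000)
neighbor = Bin("Chr01", 1_000_000, 2_000_000)
label = classify_association("Chr05", 300_000, tested, [neighbor])
```

Under this scheme, only genes lying outside both the tested bin and its correlated neighbors are reported as trans-regulated.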

Trans-expression dQTL were not uniformly distributed along the genome, with some of the dosage chromosomal bins generating up to 103 trans-expression dQTL (Figure 5, C). A permutation test determined that unusually large numbers of trans-expression dQTL were associated with dosage variation on chromosomes 1, 5, 9, 14, 17, and 19 (“Materials and methods” section). About half of these large trans-expression dQTL regions (on chromosomes 1, 9, and 14) co-localized with a large number of phenotype dQTL controlling leaf size and shape, serration, stomatal patterning, and tree biomass. In contrast, the regions on chromosomes 5, 17, and 19 that were associated with large numbers of trans-expression dQTL did not co-localize with any of the phenotype dQTL, indicating modulation of gene expression that did not significantly impact the morphological phenotypes assayed.

Gene modules within transcriptional co-expression networks are associated with morphological trait variation

To summarize the complex effects of dosage variation across the genome on global gene expression and its ultimate impact on leaf phenotypes, a weighted gene co-expression network analysis (WGCNA) was used to identify and functionally characterize modules of genes that are co-expressed among the indel lines assayed (Figure 1, D). Distinct biological pathways impacting leaf morphological phenotypes can then potentially be inferred based on functional features of co-expression modules. A gene co-expression regulatory network was calculated that excluded the direct effects of dosage on genes in cis (“Materials and methods” section), and identified 19 modules of co-expressed genes (excluding the gray module of unassigned genes) containing between 21 and 8,258 genes each (Supplemental Figure S6). A first functional annotation of the modules tested the Pearson correlation of each module eigengene expression to individual leaf morphometry traits (“Materials and methods” section; Figure 6, left panel). The eigengene expression of some modules correlated with only a few of the leaf trait categories; for example, the brown module correlated with stomata- and serration-related traits only. However, the majority of the modules correlated with multiple trait categories. Four modules (steelblue, sienna3, dark magenta, and dark green) did not correlate with any of the phenotypic traits under investigation. Next, the primary biological functions associated with each co-expression module were estimated using a Gene Ontology (GO) analysis of relevant Biological Process categories (Figure 6, right panel). From 15 to 291 GO terms were found to be significantly enriched per co-expression module, when compared with the GO functions associated with all genes expressed in leaf tissue (Supplemental Dataset S5). This analysis revealed enrichment of leaf trait modules for GO terms describing cell differentiation and cell division, cell wall organization, the establishment of tissue polarity, and hormone signaling.

Figure 6.


Relationship between module expression and trait values. Left: Module to phenotypic trait relationship as measured by the P-value of the Pearson correlation between the modules’ eigengene expression and the phenotypic BLUP values. Rows were ordered according to a hierarchical clustering of the eigengene expression of each module across the tree lines, as represented by the hierarchical tree clustering on the far left. X symbols indicate occurrence of colocalization between the phenotype-dQTL and the module-dQTL. Right: Module to GO relationship as measured by the P-value of the GO enrichment. Each cell indicates the number of genes belonging to child GO terms showing significant enrichment in the various modules, when compared with all the genes expressed in leaves. Color, from light orange to dark red, indicates significance levels at P < 0.05, P < 0.01, and P < 0.001.

Differential gene expression analysis of lines with extreme phenotypes evaluates candidate genes independently of dosage

We performed a differential gene expression analysis between pools of lines exhibiting extreme phenotypes to provide further insight into the mechanisms influencing leaf shape, and additional criteria for the identification of candidate genes controlling leaf morphometric traits (“Materials and methods” section; Figure 1, C). The goal of this analysis was to assess the correlation of individual candidate genes underlying the phenotype dQTL with traits, independently of direct (cis-) gene dosage effects on each candidate. This analysis revealed significant differential expression for an average of ∼41 genes (range 1–316) per phenotype dQTL analyzed (Supplemental Dataset S6). For nine traits, however, no evidence for differential expression was found for any of the genes underlying the phenotype dQTL bins: Fourier_PC3_y1_y2, Fourier_PC4_y1_y2, Fourier_PC2:PC4_y1_y2, Abaxial_stomata_density, Adaxial_stomata_density, Abaxial_stomata_distance, Adaxial_stomata_distance, Abaxial_stomata_cluster, and Adaxial_stomata_cluster (Supplemental Dataset S7). Interestingly, the list of differentially expressed genes was enriched with transcription factors, in contrast to cis-expression dQTL-associated genes, which were depleted in transcription factor-encoding genes, and trans-expression dQTL-associated genes, which did not show any significant enrichment of transcription factor-encoding genes (Supplemental Figure S7).

Phenotypic trait prediction is improved by integration of multiple data types

Next, we explored the complex relationships among variation in dosage, individual gene expression, transcriptional networks, and morphometric traits. Four additive linear models (“Materials and methods” section) were contrasted for their ability to predict phenotypes. The first model was based on the additive effects of relative gene dosage variations at the phenotype dQTL bins. The second model used the differentially expressed genes underlying these phenotype dQTL bins in lines with extreme phenotypes. The third model used the eigengene values of co-expression modules that were significantly associated with the phenotypic traits (Pearson correlation P-value < 0.05). Additional models used a combination of two, or all three, data types: gene dosage variation, differentially expressed genes, and co-expression module eigengenes. Overall, the full model fitting all three data types was more successful at explaining the observed phenotypic variance than any of the models including only one or two data types (Figure 7, Supplemental Dataset S8, and Supplemental Figure S8). Specifically, an average of 43.3% ± 17.1% of the variance was explained by the full model, compared with 12.7% ± 7.9%, 23.2% ± 12.9%, and 21.0% ± 19.3% for single-data models using gene dosage, differentially expressed genes, and module eigengenes, respectively. The performance of models using two of the three data types was intermediate (Supplemental Figure S8).

Figure 7.


Prediction of phenotypic variation, based on the various analysis approaches. Phenotypic variance prediction, in terms of percentage of variance explained, by additive linear models combining relative gene dosage under phenotype dQTL bins, eigengene expression of differentially expressed genes under the phenotype dQTL bins in extreme phenotypic mutants, eigengene expression of WGCNA modules correlated with the phenotypic traits (at Pearson P < 0.05), and a combination of all three data types. Each boxplot shows the mean and SD of the phenotypic variance explained across all morphometric traits. Different letters indicate statistically different means at the 5% significance level, according to Tukey’s test.

Systems genetic approach provides finer resolution and prioritizes candidate genes underlying the morphometric trait variation

To provide finer-scale insights into regulatory networks and prioritize potential candidate genes influencing phenotypic trait variation, genes were evaluated using multiple criteria, including dosage-sensitive expression, position and connectivity within co-expression networks, correlation with phenotypic traits, and functional annotations. Specifically, the cis- and trans-expression dQTL genes, as well as the differentially expressed genes associated with the phenotype dQTL bins, were mapped onto the gene co-expression network. Then, for each candidate gene, correlations with the eigengene for the module (module membership) and the phenotypic trait (gene significance) were determined. Genes that were both highly connected to the other genes within a co-expression module (as expected for a potential regulator of other genes within the module) and highly correlated with the phenotypic trait (high gene significance) were prioritized in the candidate gene list, and their potential biological roles evaluated based on functional annotations (Figure 1, E).

Across all phenotypes, 2,560 cis- and 533 trans-expression dQTL genes, and 1,736 genes differentially expressed among lines with extreme phenotypes were associated with one or more phenotype dQTL bins. In addition, gene co-expression analysis revealed the presence of 5,053 genes within the upper quartiles of both module membership and gene significance for modules correlated with leaf traits (“WGCNA quartile”; Figure 8, left and Supplemental Dataset S9). Genes commonly identified through expression dQTL, differential expression, and gene co-expression network analyses were used to tabulate prioritized candidate gene lists influencing traits: 917 genes were supported by two different analyses for their involvement in leaf morphometry, while 116 genes were identified by three of these analyses (1,033 prioritized genes in total). On average, among all phenotype dQTL, this represented a list of about 14 prioritized genes per phenotype dQTL among the average of 258 genes annotated in the phenotype dQTL intervals (Supplemental Dataset S10). This prioritized list of candidate genes was enriched in particular classes of transcription factors (Figure 8, right) that could be further evaluated as candidates for regulating expression of genes close by within co-expression networks.
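The “WGCNA quartile” filter can be illustrated with a minimal Python sketch. It assumes that module membership and gene significance are taken as absolute Pearson correlations and that the cut-off is the 75th percentile within the gene set supplied; the function names, the use of absolute values, and the toy data are assumptions made for illustration, not details confirmed by the text.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_quartile_genes(expression, eigengene, trait):
    """Return genes above the 75th percentile of both module membership and
    gene significance.

    expression: dict mapping gene name -> expression values across lines
    eigengene:  module eigengene values across the same lines
    trait:      phenotypic trait values across the same lines
    """
    # Module membership: correlation of each gene with the module eigengene.
    mm = {g: abs(pearson(v, eigengene)) for g, v in expression.items()}
    # Gene significance: correlation of each gene with the trait.
    gs = {g: abs(pearson(v, trait)) for g, v in expression.items()}
    mm_cut = statistics.quantiles(mm.values(), n=4)[2]  # 75th percentile
    gs_cut = statistics.quantiles(gs.values(), n=4)[2]
    return sorted(g for g in expression if mm[g] > mm_cut and gs[g] > gs_cut)


# Toy data (hypothetical): four genes measured in six lines.
toy_expression = {
    "hub": [2, 4, 6, 8, 10, 12],
    "g2": [1, 3, 2, 4, 3, 5],
    "g3": [5, 1, 4, 2, 6, 3],
    "g4": [3, 3, 4, 3, 4, 3],
}
toy_eigengene = [1, 2, 3, 4, 5, 6]
toy_trait = [1, 2, 3, 4, 5, 7]
prioritized = upper_quartile_genes(toy_expression, toy_eigengene, toy_trait)
```

In the toy data only the gene that tracks both the eigengene and the trait closely passes both cut-offs, which mirrors the idea of prioritizing well-connected genes that also correlate with the phenotype.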

Figure 8.


Identification of candidate genes. (A) Number of candidate genes identified by the various approaches across all morphometric traits. The “cis-” and “trans-expression dQTL” categories represent the expression dQTL genes associated locally or distantly with the phenotype dQTL bins, respectively. The “DE genes” category includes the genes differentially expressed between extreme phenotypic mutant pools and that are physically located under the phenotype dQTL bins. The “WGCNA quartile” category refers to the genes located above the 75th percentile both in terms of module membership and gene significance (Figure 1, E). Module membership is a measure of the individual gene correlation to the module eigengene expression, and is also representative of the gene modular connectivity. Gene significance represents the correlation between the gene expression pattern and the phenotypic trait variation. We only included the modules that were correlated with the traits. Barplot shows how many genes were identified by each approach, and also at the intersection of two, or more datasets (two or three lines of evidence), to obtain a prioritized candidate gene list. (B) Enrichment of transcription factor families in the prioritized candidate gene list (N = 1,033 genes having two or three lines of evidence), compared with the full list of genes assayed in the leaf tissue. * indicates enrichment evaluated at P < 0.05 by Fisher exact test.

To further illustrate the advantages of data integration at a finer scale, we dissected a pleiotropic phenotypic dQTL located on chromosome 14, which influences the leaf shape trait Fourier PC3:PC4 (corresponding to the ratio of the third and fourth PCs of the morphometric Fourier variables) as well as other traits related to leaf shape, leaf size, stomata, and serration (Figure 9, A). The two sets of analyses, the cis-expression dQTL and the differential expression among extreme phenotypic mutants, revealed partially overlapping sets of candidate genes (Figure 9, C). Next, to further investigate the relationship between modules and potential candidate genes, we characterized these classes of genes (i.e., cis-, trans-expression dQTL, differentially expressed genes) in terms of relative module membership and gene significance for the leaf shape trait Fourier PC3:PC4 (Figure 9, D). A greater proportion of the genes differentially expressed between lines with extreme phenotypes was found to map in the upper right quartile, followed by the cis- and the trans-expression dQTL genes (34%, 16%, and 10%, respectively). The genes differentially expressed between lines with extreme phenotypes were enriched in the upper WGCNA quartile (pairwise comparisons, all Fisher exact test P-value <0.001). This group of genes was also enriched in transcription factors, suggesting that they might contribute to the regulation of these specific modules. The lower proportion of cis-expression dQTL genes in the WGCNA quartile may reflect effective exclusion of dosage-sensitive genes within the dQTL phenotypic bins that are not directly associated with leaf phenotypic trait variation. Finally, the majority of the trans-regulated genes were found to have lower module membership and lower gene significance. Such topological properties of the three classes of genes are further illustrated in Figure 9, E for two example modules, violet and turquoise. While the majority of the differentially expressed genes were found toward the center of the network, trans-expression dQTL genes were preferentially found at the periphery of the network, corresponding to low module membership and low intra-modular connectivity.

Figure 9.


Illustration of the systems genetics approach to prioritize candidate genes associated with a particular phenotype dQTL. Candidate genes associated with one particular phenotype dQTL are presented. (A) This phenotype dQTL is located on chromosome 14 and was also found to control various phenotypic traits, including the leaf shape trait Fourier_PC3:PC4. (B) and (C) Distribution of the candidate genes among the various categories of the Venn diagram, as well as (D) their locations in the scatter plot of gene significance for Fourier_PC3:PC4 versus module membership, for modules correlated with the trait. Triangles in the scatterplots represent transcription factors. Color coding is the same as in the Venn diagram. (E) Detailed network representation of the candidate genes mapping in the violet and turquoise co-expression modules. Nodes represent the genes, and edges represent the interaction between the genes. Edge thickness is proportional to the interaction weight between gene pairs. Size of the nodes is proportional to the module membership value of the gene. Genes with high module membership tend to plot at the center of the network, because they interact with many other genes composing the module (red box). Genes with low module membership tend to plot at the periphery of the network. The network was plotted using Cytoscape 3.8.0 and the Prefuse Force Directed Layout.

Functional annotations of the prioritized gene lists from the analyses above overlay finer resolution biological interpretations of the leaf shape trait Fourier PC3:PC4 (Figure 9, D). For instance, module “turquoise” is a large module containing 8,258 genes and associated with multiple phenotypic trait categories and gene ontologies (Figure 6). However, the integration of the multiple analyses described above with functional annotations provides a way to identify potential causal genes. A modest number of prioritized genes included YABBY (YAB1, Potri.014G066700.v3.1) and growth regulating factor (GRF) transcription factor families (GRF9, Potri.014G071800.v3.1), as well as numerous genes encoding proteins involved in microtubule cytoskeleton organization (MAP65-3 Potri.014G070100.v3.1; IQD17 Potri.014G104600.v3.1) and microtubule binding (ADL1E Potri.014G043600.v3.1). Another module, “violet,” was enriched with genes involved in the hormone signaling pathways (Figure 6), and prioritizes genes belonging to the NAC (NAC028 Potri.014G041300.v3.1) and WRKY families (WRKY53 Potri.014G096200.v3.1; Figure 9, A), members of which are involved in a variety of hormone signaling cascades during development and environmental stresses (Xie et al., 2014). The approaches illustrated here could be used similarly to explore and summarize other traits and biological pathways involved in leaf morphometry (Figures 6, 9).

Discussion

In the work presented here, we used an integrative systems genetics approach to understand the complex influence of gene dosage on leaf morphology in a unique, clonally replicated poplar interspecific F1 hybrid population carrying insertions and deletions tiling the genome (Henry et al., 2015). Importantly, the high frequency of induced dosage variation in the study population enabled analyses linking dosage, gene expression, and phenotypic trait variation that would be difficult in populations with lower frequencies of naturally occurring structural and CNV. A primary feature of our approach was the integration of different data types. We first systematically assessed the frequency and magnitude of the effects of dosage-sensitive loci on leaf morphological traits. Next, we assayed the effects of dosage on the expression of individual genes affecting leaf morphology. We then used gene co-expression networks to integrate different data types, summarizing biological features of higher order of gene interactions, and ultimately correlating gene modules and individual candidate genes with phenotypic traits (Figure 1). The end result was a multi-level view of how dosage affects gene expression and morphological variation, ranging from the overall genetic architecture and broader molecular mechanisms, to individual candidate genes underlying the observed morphological variation.

Our analyses detailed the complex responses of gene expression to dosage variation across the Populus genome, including how different types of gene expression responses correlated with leaf morphology variation (Figure 5). Large numbers of genes were found to be responsive to changes in their own dosage in terms of transcript number (i.e. cis-expression dQTL). Trans-expression dQTL were also discovered, including cases where large numbers of genes across the genome changed in expression in response to dosage changes at specific regions of the genome. Such cases are consistent with the idea that changes in selected dosage-sensitive trans-acting regulators (e.g. a gene encoding a transcription factor) can result in complex changes in gene expression that ultimately impact phenotypes. Interestingly, in our study, many of the dosage-sensitive regions associated with large numbers of trans-expression dQTL were pleiotropic and correlated with a wide variety of leaf morphometric and biomass-related traits (Figure 5). Such pleiotropic effects have been shown previously in natural populations of Populus for variation in the Populus ortholog of the Class III HD ZIP transcription factor, ptREVOLUTA (Porth et al., 2014). In contrast, chromosomal bins that were not associated with large numbers of trans-expression dQTL tended to control more specific morphological traits with no pleiotropic effects. Such cases would be consistent with modulation of a dosage-sensitive gene that affects the phenotype but not necessarily through direct transcriptional regulation, for example by affecting the stoichiometric relationships of the gene product within a protein complex, i.e. gene balance (Birchler and Veitia, 2010, 2012). Additionally, some regions associated with large numbers of trans-expression dQTL did not co-localize with any of the dQTL bins controlling the leaf morphology traits. Such cases could indicate the presence of transcriptional regulators that do not affect the morphological traits examined here.

Subsequent analyses demonstrated the increased power of prediction and biological inference through integration of multiple data types within a systems genetics framework, using gene co-expression networks as a primary point of integration. We mapped the cis- and trans-expression dQTL, as well as genes differentially expressed between lines with divergent phenotypes onto the gene co-expression network, and assessed individual genes for potential to regulate other genes and influence phenotypic traits. The power of integrating different data types was illustrated by increased phenotypic trait prediction (Figure 7), but perhaps more importantly data integration enabled perspectives ranging from overviews of mechanisms underlying trait variation to individual candidate genes. For example, correlations of co-expressed gene modules to morphological traits suggested a role for cell division and cell growth, as well as hormone signaling pathways and tissue polarity (Figure 6) previously identified as key mechanisms regulating leaf morphology in other species (Chitwood and Sinha, 2016; Kierzkowski et al., 2019). Then, by further dissecting the interactions and biological features of individual candidate genes within each of these module networks, we provided functional insights into interactions and regulatory relationships among genes within these modules. For example, candidate hub genes underlying the phenotype dQTL were identified that both preferentially mapped at the center of the modules and showed higher correlation with the phenotypic traits, as expected for regulatory genes that influence phenotypes. In contrast, trans-expression dQTL genes tended to map at the periphery of the network modules, suggesting downstream roles within regulatory pathways (Figure 9).

Dosage variation in the form of polyploidy has been previously linked to variation in organ sizes, including leaf size (Orr-Weaver, 2015). The effects of dosage variation at individual loci on morphology, as found for CNV, are less clear. Classical mutations of individual genes affecting leaf morphology in model inbred species like Arabidopsis and maize are typically not haploinsufficient (Meinke, 2013), which could be interpreted to mean that dosage variation is not relevant for morphological traits, or otherwise can be masked by genetic or physiological compensation. Alternatively, more subtle effects of dosage-sensitive loci on phenotypes might be undetected in a heterozygous state, while others might be among the many pleiotropic or lethal phenotypes identified in classical developmental genetic screens (Mayer et al., 1991; Meinke, 2020). In contrast, our quantitative approach allowed detection and quantification of subtle effects of dosage that would likely be missed in forward genetic screens aimed at genes underlying large effects. Indeed, we previously found frequent associations between dosage and biomass and phenology traits in poplar (Bastiaanse et al., 2019), similar to what we found here for leaf morphology. In this regard, our results are more aligned with recent findings in tomato, which demonstrated a link between frequent genome-wide variation in dosage and multiple agronomic and morphological quantitative traits (Soyk et al., 2019).

Our approach here reflects the expectation that morphological variation in natural populations results from diverse and nuanced sources of genetic variation. Similar to many temperate forest tree species, Populus are wind-pollinated, outcrossing (Populus is dioecious) species characterized by highly heterozygous individuals (Jansson et al., 2010). As such, populations can carry a variety of genetic variation affecting dosage including loss-of-function alleles, structural variation, and CNV, as illustrated by Populus pan-genome analyses that found nearly 8% of Populus gene models displayed CNV (Fehrmann et al., 2015; Pinosio et al., 2016). Our analyses here found a strong effect of induced dosage variation on leaf morphology (Figures 3, 4), suggesting the potential for naturally occurring CNV to contribute to morphological and other phenotypic variation in wild populations of Populus and other forest trees.

Studies of the evolution and development of leaf morphology have traditionally focused on amino acid sequence versus expression differences among species for individual genes shown to affect morphology in model plants (e.g. Arabidopsis, maize, and tomato; Tsukaya, 2005; Chitwood and Sinha, 2016; Runions and Tsiantis, 2017; Kierzkowski et al., 2019). In light of recent findings of prevalent structural and gene CNV in plants (Saxena et al., 2014; Pinosio et al., 2016; Sun et al., 2018; Yang et al., 2019), it is worthwhile to now reconsider the relative role of different evolutionary processes in the evolution of leaf morphology. In our experiment, dosage variation increased the overall phenotypic range observed in indels versus non-indel full sib genotypes (Supplemental File S1). Interestingly, indel dosage variants overlapped morphological variation seen in related Populus and Salix species (Figure 3), showing that dosage variation has the potential to enact rapid morphological change as observed within and among related species. Taken together, these results are consistent with the idea that dosage variation could affect morphological traits both within populations as well as during speciation (Feulner and De-Kayne, 2017), perhaps even facilitating rapid changes in phenotypes.

The work presented here is an entree for research aimed at understanding the role of natural dosage variation affecting not only traits of agronomic and ecological importance, but also the evolution and development of morphological traits in general. Our results demonstrate how leaf morphological variation can be generated using experimentally induced variation, but now investigations must identify and evaluate naturally occurring dosage variation and how it is parsed during selection and evolution. Populus is an excellent system for such investigations. For example, a whole genome duplication in the Salicoid lineage allows evaluation of duplicated paralogs regarding fractionation and dosage sensitivity, and pedigrees and large populations of Populus are available for genotyping and phenotyping (Bradshaw et al., 2000; Tuskan et al., 2006). In future studies, we will use genome-wide analyses to test hypotheses relating gene balance and genome stoichiometry to genome evolution and phenotypic traits (Birchler and Veitia, 2010).

Materials and methods

Creation and genomic analysis of irradiation hybrids in Populus

Methods for the creation, genomic analysis, and use of irradiation hybrid lines for mapping dQTL underlying phenotypic traits in field trials were described previously (Henry et al., 2015; Zinkgraf et al., 2017; Bastiaanse et al., 2019) and are summarized in Supplemental Figure S9. Briefly, two female genotypes of P. deltoides (SO546SL and SO598SL) were crossed with pollen of male P. nigra (SO361SL) that had been gamma-irradiated at 100 Gy. The resulting F1 populations were “IFG_100” (126 genotypes) and “GWR_100” (470 genotypes), respectively, for a total of 596 mutagenized lines. An additional 53 lines derived from a cross between SO546SL and the non-irradiated pollen of SO361SL were produced as controls (population “IFG_0”). Illumina low-coverage sequencing was used to identify and fine map insertions and deletions (indels) across the Populus genome (Henry et al., 2015; Bastiaanse et al., 2019). Clonally replicated copies (ramets) of each genotype were entered into field trials, measured for phenotypes, and dosage-sensitive loci affecting quantitative traits (dosage phenotypic QTL) were mapped based on correlations between relative dosage within genomic bins defined by indel breakpoints and phenotypic traits for each genotype (Supplemental Figure S1; Henry et al., 2015; Zinkgraf et al., 2017; Bastiaanse et al., 2019). Indel coverage of the poplar genome in the experiments here is summarized in Supplemental Figure S11.

Leaf phenotyping

The first fully expanded leaf was sampled from the apex of each of three biological replicates (ramets) of the individual lines in the field in May 2015 and July 2016, placed under non-reflective glass on a light box, and photographed at a fixed distance using a Canon EOS Rebel T3i camera along with a ruler and a genotype identification tag. In July 2015, a second round of leaf sampling was performed to record stomata size, stomata density, as well as various spatial parameters of stomata distribution. For each leaf, clear nail varnish was used to make impressions of the abaxial and adaxial leaf epidermis in an approximately 1-cm² area at the center of the leaf blade to the right side of the central vein, and avoiding secondary veins. Clear tape was used to retrieve the varnish peels, which were then digitally imaged under bright field light microscopy at 10-fold magnification. Digital images were analyzed using the convolutional neural network program “Stomata Counter” (Fetter et al., 2019). Stomata identified in individual images were visually inspected to ensure accuracy, and manually re-annotated when necessary. The x- and y-coordinates of individual stomata from the Stomata Counter were fed into an R pipeline published by Naulin et al. (2017) to estimate stomatal density, mean NND, as well as stomatal clustering index. Stomatal clustering index was calculated as the observed average NND compared against the expected NND under complete spatial randomness. The ratio between the observed NND and the expected NND represents the stomatal clustering index and is equal to 1 if the pattern is random, is <1 if the pattern is clustered and is >1 if the pattern is uniform. Finally, the length of the guard cells from five randomly selected stomata per image was manually measured using ImageJ (Schneider et al., 2012; Naulin et al., 2017).
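The clustering index described above can be sketched in Python as follows. This is a minimal illustration, not the Naulin et al. (2017) R pipeline: it assumes the expected NND under complete spatial randomness is given by the Clark-Evans expression 1/(2√density), applies no edge correction, and uses hypothetical function names and toy coordinates.

```python
import math


def mean_nnd(points):
    """Mean nearest-neighbor distance for a list of (x, y) coordinates."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def clustering_index(points, area):
    """Observed mean NND divided by the NND expected under spatial randomness.

    Assumption: expected NND = 1 / (2 * sqrt(density)), with density
    expressed as points per unit area and no edge correction.
    """
    density = len(points) / area
    expected_nnd = 1.0 / (2.0 * math.sqrt(density))
    return mean_nnd(points) / expected_nnd


# Toy patterns (hypothetical coordinates, arbitrary units).
grid = [(x, y) for x in range(4) for y in range(4)]      # evenly spaced
pairs = [(0, 0), (0.1, 0), (10, 10), (10.1, 10)]          # tight pairs
uniform_index = clustering_index(grid, area=16.0)         # greater than 1
clustered_index = clustering_index(pairs, area=100.0)     # less than 1
```

The evenly spaced grid yields an index above 1 and the tightly paired points an index below 1, matching the interpretation given in the text.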

Leaf morphometric analysis

Traits describing leaf size and indicators of leaf shape were automatically retrieved from individual leaf images using Lamina software (Bylesjö et al., 2008). Leaf size descriptors included leaf area, perimeter, length, and width. Indicators of leaf shape included leaf circularity, the length-to-width ratio, as well as traits describing the vertical and horizontal asymmetry of the blade, calculated as the ratio of the leaf width measured at 25% and 75% of the leaf length, and the ratio of the leaf length measured at 25% and 75% of the leaf width, respectively. Lamina also quantified traits describing leaf indents, including number of indents, indent mean width, and indent mean depth (Supplemental Dataset S11).

Next, a global morphometric analysis of leaf shape was conducted using the R package “Momocs” (Bonhomme et al., 2014). While the Lamina leaf shape descriptors were restricted to measurements of leaf sizes relative to each other, the morphometric approach used by Momocs described the shape as a whole irrespective of variation in size. Digitized leaf images were converted to binary, and the outline of individual leaves was extracted by defining the closed polygon formed by the (x; y)-coordinates of the image pixels (Supplemental Figure S11, A). Next, the geometric information of individual leaf outlines was quantified using an elliptic Fourier series approach, retaining only the symmetric variation of the leaf shape (efourier and rm_asym functions of the package Momocs; Supplemental Figure S11, B). In the Fourier approach, the periodic function of an outline is decomposed into the sum of simpler trigonometric functions that have frequencies that are harmonics of one another (Bonhomme et al., 2014). When describing shapes, the lower harmonics typically provide approximations for the coarse-scale trends in the original periodic function of the true shape, while the higher harmonics add fine-scale variations and typically less information about overall shape. To select the appropriate number of harmonics for our study, we examined the spectrum of harmonic Fourier power on the full set of leaf shapes and found seven harmonics gave a satisfactory reconstruction of the leaf shape, gathering approximately 97% of the harmonic power and were stably inherited (Supplemental Figure S12, A–C). Quantitative variables from the Fourier series were then subjected to a PC analysis. We focused on the first four PCs (PC1, PC2, PC3, and PC4), and their respective ratios (PC1_PC2, PC1_PC3, PC2_PC3, PC1_PC4, PC2_PC4, and PC3_PC4), and treated these variables as phenotypic traits describing leaf shape in our subsequent genetic analysis.
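The choice of the number of harmonics from the cumulative harmonic power can be sketched as follows. This is an illustrative Python example rather than the Momocs code used in the study; it assumes the common definition of harmonic power as half the sum of the squared elliptic Fourier coefficients of each harmonic, and the function names, the inclusion of the first harmonic, and the toy power spectrum are hypothetical.

```python
def harmonic_power(a, b, c, d):
    """Power of each harmonic from its four elliptic Fourier coefficients.

    Assumption: power_n = (a_n^2 + b_n^2 + c_n^2 + d_n^2) / 2.
    """
    return [
        (an ** 2 + bn ** 2 + cn ** 2 + dn ** 2) / 2
        for an, bn, cn, dn in zip(a, b, c, d)
    ]


def harmonics_needed(power, threshold=0.97):
    """Smallest number of leading harmonics whose cumulative share of the
    total power reaches the threshold."""
    total = sum(power)
    running = 0.0
    for n, p in enumerate(power, start=1):
        running += p
        if running / total >= threshold:
            return n
    return len(power)


# Toy power spectrum (hypothetical values): power decays with harmonic rank.
toy_power = [50, 25, 12, 6, 3, 2, 1, 1]
n_harmonics = harmonics_needed(toy_power, threshold=0.97)
```

Whether the first harmonic is included in the total is a design choice, since it mostly captures overall size and orientation; it is kept here only to keep the sketch short.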

Estimation of the BLUPs and broad sense heritability (H2) of the phenotypic traits

All statistical analyses of the Lamina, Momocs, and stomata traits (Supplemental Dataset S10) were performed using R 3.6.0 in the RStudio environment 1.1.456. Normal distribution of each phenotypic trait was assessed using a Shapiro–Wilk test. Non-normally distributed data were transformed using the Box–Cox transformation (Box and Cox, 1964). The phenotypic variance was assessed in a mixed linear model fitted by restricted maximum likelihood (REML) using the lmer procedure implemented in R: $Y_{ijk} = \mu + F_i + P_j + L_{k(j)} + \varepsilon_{ijk}$, where $\mu$ is the general mean, $F_i$ is the effect of field $i$, considered as fixed, $P_j$ is the effect of population $j$, considered as random, $L_{k(j)}$ is the effect of line $k$ nested within population $j$, considered as random, and $\varepsilon_{ijk}$ is the residual error. Phenotypic data from multiple years were jointly analyzed by adding year $YR_l$ as a fixed effect to the models: $Y_{ijkl} = \mu + F_i + P_j + L_{k(j)} + YR_l + \varepsilon_{ijkl}$. The variance component estimates resulting from these analyses were used to estimate the broad sense heritability (H2) using the equation: H2 = Variance lines/Variance total, where “variance lines” represents the variance of the three clonal replicates of each line, nested within the two populations IFG and GWR in our model, and “variance total” represents the sum of the total variance of the model. The 95% confidence interval around the point estimate H2 was estimated using a bootstrap procedure with 1,000 simulations. Mixed linear models were also used to estimate the BLUPs of the effect of lines (nested within population) on the phenotypic estimates across years and environments (fields). These BLUP estimates were used for the subsequent statistical and genomic analysis.
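As a small worked illustration of the heritability ratio, the Python sketch below computes H2 from a set of variance components. It assumes the total variance is the sum of the random-effect and residual components returned by the mixed model. The component names and values are hypothetical.

```python
def broad_sense_heritability(var_components, line_key="line"):
    """H2 = variance attributed to lines / total variance.

    Assumption: the total variance is the sum of all supplied components.
    """
    total = sum(var_components.values())
    return var_components[line_key] / total


# Hypothetical variance components (illustrative values only).
components = {"population": 0.5, "line": 3.0, "residual": 1.5}
h2 = broad_sense_heritability(components)  # 3.0 / 5.0
```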

One-way ANOVA using the R function “aov” was used to evaluate the mean and variance of the BLUP phenotypic values according to various grouping factors (e.g. indel versus control non-indel genotypes, deletion versus insertion lines). Where appropriate, Tukey’s test was used for post hoc tests of significant differences between means using the R function “TukeyHSD” at the 5% significance level. A correlation matrix between the phenotypic traits was computed as the pairwise Pearson correlation coefficients using the “cor” function in R.

RNA sampling and RNA-Seq library preparation

In June 2016, the first expanding leaf from a side branch of all three clonal replicates of a subset of 164 lines was harvested in the field, and immediately frozen in individual tubes placed in an ethanol-dry ice bath. Frozen tissues were pulverized using the Qiagen TissueLyser, and total RNA was extracted using the Norgen Plant Total RNA Kit (catalog #25800) along with the Norgen in-column DNase I Kit (catalog #25710), following the manufacturer’s instructions. RNA quality was assessed using a NanoDrop and 1.5% agarose gels. RNA concentrations were quantified using an Invitrogen Qubit RNA BR Assay Kit (catalog #Q10210).

RNA-Seq libraries were created using the KAPA mRNA Hyper Prep (catalog #KK8581) following the manufacturer’s instructions and ligated with in-house 8-bp dual index adapters. Libraries were enriched for seven–nine cycles and quantified using a plate fluorometer with SYBR Green I. For the first 20 lanes, libraries were pooled in equimolar amounts based on SYBR Green I values. After the first sequencing dataset was available, we re-pooled the libraries based on mapping percentages and re-sequenced in one NovaSeq lane. The first 20 lanes were sequenced as 100-bp SE, and the NovaSeq lane was sequenced as 150-bp PE. The sequencing was carried out at the DNA Technologies and Expression Analysis Cores at the UC Davis Genome Center, supported by the NIH Shared Instrumentation Grant 1S10OD010786-01. Sequencing data are available through NCBI SRA Accession PRJNA646735.

Preprocessing of the RNA-Seq sequence data and read mapping

The sequenced RNA-Seq libraries were demultiplexed using the Allprep multiplexed sequencing pipeline (https://github.com/Comai-Lab/allprep). Reads were trimmed based on a sliding 5-bp window sequence quality cutoff of Phred score 20, and trimmed after incidences of “N” nucleotides in the read sequence. Reads were then mapped to the P. trichocarpa v3.1 reference (https://phytozome.jgi.doe.gov/) using the STAR aligner (Dobin et al., 2013). The output STAR BAM alignment was indexed using SAMtools (Li et al., 2009) and final read counts were generated using htseq-count (Anders et al., 2015).
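The trimming rule can be sketched in Python as follows. This is an illustration of sliding-window quality trimming in general and does not reproduce the Allprep implementation. It assumes a read is cut at the first “N” and at the start of the first 5-bp window whose mean Phred score falls below 20. The function name is hypothetical.

```python
def trim_read(seq, quals, window=5, min_mean_q=20):
    """Trim a read at the first 'N' and at the first window whose mean
    Phred score falls below `min_mean_q`. Returns (sequence, qualities)."""
    # Cut at the first ambiguous base, if any.
    n_pos = seq.find("N")
    if n_pos != -1:
        seq, quals = seq[:n_pos], quals[:n_pos]
    # Slide a fixed-width window along the read from its 5' end.
    for start in range(0, len(seq) - window + 1):
        if sum(quals[start:start + window]) / window < min_mean_q:
            return seq[:start], quals[:start]
    return seq, quals


# Quality drops near the 3' end of this made-up read.
trimmed_seq, trimmed_quals = trim_read(
    "ACGTACGTAC", [30, 30, 30, 30, 30, 30, 10, 10, 10, 10]
)
```

In this example the window starting at position 4 has a mean score of 18, so the read is truncated to its first four bases.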

Estimation of the BLUEs and broad sense heritability (H2) of the gene expression traits

The estimated read counts were filtered such that only genes that have more than 10 reads per million in at least 80% of the libraries were retained. Normalization of the read counts was performed using the trimmed mean of M-values (TMM) method in the edgeR package (Robinson and Oshlack, 2010) and a variance stabilizing transformation was performed using voom (Law et al., 2014). Relationships between the mean and variance, before and after variance stabilization are presented in Supplemental Figure S13.
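The expression filter can be illustrated with a short Python sketch. It assumes counts per million are computed from the raw library sizes (the column sums) and keeps a gene when it exceeds 10 reads per million in at least 80% of libraries. The function name and the toy counts are hypothetical, and the normalization steps (TMM, voom) are not reproduced here.

```python
def filter_genes_by_cpm(counts, min_cpm=10.0, min_fraction=0.8):
    """counts: dict mapping gene -> list of raw counts, one per library.
    Returns the genes exceeding `min_cpm` in at least `min_fraction`
    of the libraries."""
    n_libs = len(next(iter(counts.values())))
    # Library size = total raw counts in each library.
    lib_sizes = [sum(row[i] for row in counts.values()) for i in range(n_libs)]
    kept = []
    for gene, row in counts.items():
        n_pass = sum(
            1 for i, c in enumerate(row) if c / lib_sizes[i] * 1e6 > min_cpm
        )
        if n_pass / n_libs >= min_fraction:
            kept.append(gene)
    return kept


# Toy count table with five libraries (illustrative values only).
toy_counts = {
    "bulk": [999_000] * 5,
    "geneB": [900, 900, 900, 900, 5],  # passes in 4 of 5 libraries
    "geneC": [100, 100, 100, 5, 5],    # passes in 3 of 5 libraries
}
kept_genes = filter_genes_by_cpm(toy_counts)
```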

The variance of the normalized gene counts among and across replicated ramets of the lines was calculated and used to estimate the gene expression heritability using the HeritSeq R package (Rudra et al., 2017). Similarly to the phenotypic data manipulation, a linear model was used to estimate the best linear unbiased estimate (BLUE) of individual genes across clonal replicates (ramets) of the same genotype using the lmFit and eBayes functions of the limma R package (Ritchie et al., 2015).

Gene expression network construction and visualization

To better understand the effect of dosage on gene expression regulation, and identify candidate genes controlling poplar leaf morphometry, we constructed a gene co-expression network using the weighted gene correlation network analysis (WGCNA) R package (Langfelder and Horvath, 2008) for the 164 indel lines. Genes were grouped into modules of co-expressed genes based on their topological overlap in the dissimilarity matrix, and the resulting modules are presented as a clustering tree (dendrogram) of genes in Supplemental Figure S6, A. Modules are assigned color labels by the WGCNA program.

We were mainly interested in the indirect co-regulation of genes due to dosage (trans-regulation) rather than the direct co-regulation of physically linked genes whose copy number varies together (cis-regulation). Therefore, for each indel line, we effectively “erased” the cis-effect of the dosage lesions by replacing the expression of the genes that are physically located under the indels with the average expression of the same genes in all other lines. The impact of this normalization is depicted in Supplemental Figure S6, B. For co-expression networks calculated in the absence of gene count normalization, approximately half the co-expression modules were highly enriched in genes that are physically located on the same chromosomes (left panel), while such colocalization of genes was not predominant for modules using normalized counts (right panel). The relationship between the modules of the first and second network is presented in Supplemental Figure S6, B.
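The cis-effect masking can be sketched in Python as below. The sketch assumes that, for each gene under an indel in a given line, the replacement value is the mean of that gene's original expression across every other line. The data structures and names are hypothetical.

```python
def mask_cis_effects(expr, genes_under_indel):
    """expr: dict line -> dict gene -> expression value.
    genes_under_indel: dict line -> set of genes located under that
    line's indels.
    Returns a copy of `expr` in which each cis-affected value is replaced
    by the mean of the same gene across all other lines."""
    masked = {line: dict(values) for line, values in expr.items()}
    for line, genes in genes_under_indel.items():
        for gene in genes:
            # Means are taken from the original, unmasked values.
            others = [expr[other][gene] for other in expr if other != line]
            masked[line][gene] = sum(others) / len(others)
    return masked


# Toy data: line L1 carries a deletion covering gene g1.
toy_expr = {
    "L1": {"g1": 5.0, "g2": 8.0},
    "L2": {"g1": 10.0, "g2": 8.5},
    "L3": {"g1": 11.0, "g2": 7.5},
}
masked_expr = mask_cis_effects(toy_expr, {"L1": {"g1"}})
```

Only the value for g1 in L1 changes. Genes outside the indel, and all values in the other lines, are left untouched.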

WGCNA parameters (Zhang and Horvath, 2005) were a soft threshold power of 6 (corresponding to an R2-value of 0.82), an unsigned network, a minimum module size of 20 genes, and a cut height of 0.99. Similar modules were merged by cutting the dendrogram of the module eigengene dissimilarities (one minus the eigengene correlation) at a height of 0.3, meaning that modules showing an eigengene correlation of 0.7 or higher were merged. Edge and node files generated by WGCNA were exported using the exportNetworkToCytoscape function and visualized in Cytoscape 3.8.0 (Shannon et al., 2003) using the Prefuse Force Directed Layout.
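Two of these parameters have a simple arithmetic meaning, sketched here in Python rather than with the WGCNA package itself. In an unsigned weighted network, the adjacency between two genes is the absolute correlation raised to the soft threshold power. A merge height of 0.3 on the eigengene dissimilarity corresponds to a correlation of at least 0.7. The function names are hypothetical.

```python
def unsigned_adjacency(cor, power=6):
    """Soft-thresholded adjacency for an unsigned network: |cor| ** power."""
    return abs(cor) ** power


def should_merge(eigengene_cor, cut_height=0.3):
    """Two modules merge when their eigengene dissimilarity
    (1 - correlation) does not exceed the cut height."""
    return eigengene_cor >= 1.0 - cut_height


adj_strong = unsigned_adjacency(0.8)   # about 0.26
adj_weak = unsigned_adjacency(-0.3)    # about 0.0007; the sign is ignored
```

Raising correlations to the sixth power keeps strong correlations while pushing weak ones close to zero.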

QTL analysis

The phenotypic BLUP values of morphometric traits were used in phenotypic dQTL analyses (phenotype dQTL) while the gene expression BLUE values of individual genes expressed in our leaf tissue were used for expression dQTL analysis (expression dQTL).

For all QTL analyses, dosage information was converted into a quantitative trait as previously described (Bastiaanse et al., 2019) and illustrated in Supplemental Figure S9. First, for each line and each bin, a relative dosage ratio was calculated by dividing the copy number of that bin by the background ploidy of that line. Second, to assess the relationship between traits (phenotypic BLUP values, individual gene expression BLUE values, module’s eigengene expression) and gene dosage, Kendall rank correlation coefficients between relative dosage ratio and phenotypic/transcriptomic data were calculated for each individual bin. P-values associated with these coefficients were adjusted for multiple testing using a relaxed Bonferroni correction, in which the P-value of each test was multiplied by the number of independent chromosomal bins rather than the total number of bins, because adjacent chromosomal bins are not independent and the majority of indels span more than one bin. The effective number of independent bins was calculated using a dissimilarity matrix (one minus the pairwise Pearson correlation coefficient) of the bins’ relative dosage ratios across all mutant lines. Next, these dissimilarities were used in a hierarchical clustering method using the “hcluster” function in R. Similar branches of the cluster dendrogram were merged using a static dendrogram tree cut off value of 0.7 (function “cutree” in R), meaning that bins corresponding to a correlation coefficient of 0.3 or above were defined as one cluster (=independent test) for the P-value adjustment (Supplemental Figure S5). Using this method, the 830 bins resolved into 45 independent clusters. Significant phenotype dQTL and module dQTL were identified as those with an adjusted P-value <0.1, and P < 0.01 for expression dQTL. For all QTL analyses, consecutive significant bins were pooled into a single larger QTL.

The percentage of variance explained by individual and multiple QTL was estimated by computing the adjusted R-squared of the multivariate linear regression model fitting all the genomic bins underlying the QTL.
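The three numerical steps of the dQTL scan (relative dosage ratio, Kendall rank correlation, and relaxed Bonferroni adjustment) can be illustrated with a self-contained Python sketch. It computes the tau-b coefficient only and leaves out the associated P-value, which the authors obtained in R. All function names and the toy values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def relative_dosage(copy_number, background_ploidy):
    """Copy number of a bin divided by the background ploidy of the line."""
    return copy_number / background_ploidy


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b), which accounts for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from all terms
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    return (concordant - discordant) / denom if denom else float("nan")


def relaxed_bonferroni(p_value, n_independent_tests=45):
    """Multiply the P-value by the number of independent bin clusters."""
    return min(1.0, p_value * n_independent_tests)


# Toy bin: two deletion lines, two unaffected lines, one insertion line,
# all in a diploid background (illustrative values only).
ratios = [relative_dosage(cn, 2) for cn in (1, 1, 2, 2, 3)]
trait = [1.0, 2.0, 3.0, 4.0, 5.0]
tau = kendall_tau_b(ratios, trait)
p_adj = relaxed_bonferroni(0.001)
```

Because many lines share the same dosage ratio at any one bin, ties are common, which is why the sketch uses the tie-corrected tau-b form.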

Identification of QTL hotspots

We identified phenotypic dQTL and trans-expression dQTL hotspots using the technique presented by Rae et al. (2009). Briefly, the length and position of the QTL were permuted across the genome 1,000 times, and a sliding window of 100 kb was used to count the number of QTL in each window. For each permutation, the maximum number of QTL per window region was recorded and sorted. The top 950th value among the 1,000 permutations (α = 0.05 significance level) was defined as the critical value for declaring a significant “hotspot” of QTL.
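A simplified Python sketch of this permutation scheme is given below. It makes several assumptions that are not stated in the source: the genome is treated as a single linear sequence, the windows are non-overlapping 100-kb tiles instead of a sliding window, and each QTL is placed uniformly at random while keeping its length. The function name and the example inputs are hypothetical.

```python
import random


def hotspot_threshold(qtl_lengths, genome_length, window=100_000,
                      n_perm=1000, alpha=0.05, seed=0):
    """Permute QTL positions and return the critical number of QTL per
    window, taken as the (1 - alpha) ranked value of the per-permutation
    maxima."""
    rng = random.Random(seed)
    n_windows = -(-genome_length // window)  # ceiling division
    maxima = []
    for _ in range(n_perm):
        counts = [0] * n_windows
        for length in qtl_lengths:
            start = rng.randrange(0, genome_length - length + 1)
            end = start + length - 1
            # Count the QTL in every window tile that it overlaps.
            for w in range(start // window, end // window + 1):
                counts[w] += 1
        maxima.append(max(counts))
    maxima.sort()
    rank = round(n_perm * (1 - alpha))  # e.g. the 950th of 1,000 values
    return maxima[rank - 1]


# Twenty hypothetical QTL of 200 kb each on a 10-Mb sequence.
example_lengths = [200_000] * 20
critical = hotspot_threshold(example_lengths, 10_000_000)
```

A window holding more QTL than the returned critical value would then be flagged as a candidate hotspot.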

Identifying potential candidate genes for the control of morphometric leaf traits based on differential expression in extreme mutants

To search for potential candidate genes underlying each phenotype dQTL, genes were identified that were located under the phenotype dQTL and exhibited expression variation correlated with phenotypic traits, but did so independently of the indels located under the dQTL. To accomplish this, for the list of genes located in cis under each phenotype dQTL bin, we used edgeR (Robinson et al., 2010) to perform differential expression analysis among the lines exhibiting the most extreme phenotypic trait values (<10% and >90% of the distribution) and lacking indels under the corresponding dQTL bins.
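The selection of lines for this contrast can be sketched in Python as follows. The sketch assumes the 10th and 90th percentiles are computed on the full trait distribution and that lines carrying an indel under the dQTL bin are removed afterwards. The source does not state the order of these two steps. Names and values are hypothetical.

```python
from statistics import quantiles


def extreme_lines_without_indel(trait_by_line, lines_with_indel):
    """Return (low, high) lists of lines below the 10th or above the 90th
    percentile of the trait, excluding lines with an indel at the dQTL."""
    deciles = quantiles(trait_by_line.values(), n=10)
    p10, p90 = deciles[0], deciles[-1]
    low = [l for l, v in trait_by_line.items()
           if v < p10 and l not in lines_with_indel]
    high = [l for l, v in trait_by_line.items()
            if v > p90 and l not in lines_with_indel]
    return low, high


# Twenty hypothetical lines with trait values 1..20; two carry an indel.
toy_traits = {f"line{i}": float(i) for i in range(1, 21)}
low_group, high_group = extreme_lines_without_indel(
    toy_traits, {"line2", "line19"}
)
```

The two returned groups would then be the contrast passed to a differential expression test.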

Comparing QTL, differential expression, and WGCNA approaches in predicting the leaf morphometric traits

To compare the effectiveness of genomic dosage, individual gene expression, and gene co-expression network data types alone, as well as a combination of these data at predicting the leaf morphometric trait variation, we constructed a series of multivariate linear regression models using the lm function in R. For each phenotypic trait, linear models were fitted using: (1) the relative gene dosage ratio under phenotype dQTL bins, (2) eigengene expression of differentially expressed genes under individual phenotype dQTL bins, (3) the eigengene expression of the WGCNA modules found to be correlated with the phenotypic trait at P < 0.05, and (4) a combination of all three data types.

Identifying potential candidate genes for the control of morphometric leaf traits based on the genetic architecture of our QTL regions

The results of the various differential expression and QTL analyses were integrated using a systems genetics approach to provide additional insights into traits and candidate genes. Effects of individual candidate genes were evaluated toward both the phenotypic traits, as well as in the broader context of their respective gene co-expression module. To achieve this, we mapped the cis- and trans-expression dQTL genes, as well as the differentially expressed genes associated with the phenotype dQTL bins onto the gene co-expression network. Then for each candidate gene, the Pearson correlation with the module eigengene expression (module membership) and with the phenotypic trait (gene significance) was determined. Genes that were both highly connected to the other genes within a co-expression module (as expected for a potential regulator of other genes within the module) and were themselves highly correlated with the phenotypic trait (high gene significance) were prioritized in the candidate gene list, and their potential biological roles evaluated based on functional annotations.
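Module membership and gene significance are both Pearson correlations, so the prioritization can be illustrated with a short Python sketch. The ranking rule used here, ordering genes by the smaller of the two absolute correlations, is an assumption made for illustration and is not the authors' stated criterion. All names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_candidates(gene_expr, eigengene, trait):
    """Score each gene by module membership (correlation with the module
    eigengene) and gene significance (correlation with the trait), then
    sort by the weaker of the two absolute values (assumed rule)."""
    scores = {}
    for gene, values in gene_expr.items():
        membership = abs(pearson(values, eigengene))
        significance = abs(pearson(values, trait))
        scores[gene] = (membership, significance)
    ranked = sorted(scores, key=lambda g: min(scores[g]), reverse=True)
    return ranked, scores


# Toy data across five lines (illustrative values only).
toy_eigengene = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_trait = [2.0, 4.0, 5.0, 4.0, 6.0]
toy_genes = {
    "hub": [1.1, 2.0, 3.2, 3.9, 5.1],
    "peripheral": [3.0, 1.0, 4.0, 1.0, 5.0],
}
ranked_genes, gene_scores = rank_candidates(toy_genes, toy_eigengene, toy_trait)
```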

To evaluate which biological process the genes contributing to each of these modules were involved in, we performed a GO enrichment analysis. In this analysis, we compared the annotation of the poplar genes present in each of the WGCNA modules, against the genes found to be expressed in our leaf tissue. GO enrichment was performed using the R package topGO. Each poplar reference gene was annotated with the GO terms associated with its Arabidopsis homolog using the org.At.tair.db R package. Categories were considered significantly enriched if the false discovery rate adjusted P-value < 0.05.

Enrichment of the prioritized candidate gene list in particular families of transcription factors when compared with all the genes surveyed in our study was evaluated by performing a Fisher’s exact test (fisher.test R function) at P-value < 0.05 and using the transcription factor poplar database provided by Zhu et al. (2007). Enrichment of classes of candidate genes (cis-, trans-expression dQTL, and DE genes) in the high quartile of the module membership versus gene significance scatter plot was performed using a Fisher’s exact test at P-value < 0.05.
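For readers unfamiliar with the enrichment test, the Python sketch below computes a one-sided Fisher’s exact P-value for a 2 × 2 table from the hypergeometric distribution. Note that R’s fisher.test reports a two-sided P-value by default, so this sketch is not a drop-in equivalent of the analysis above. The function name and the example table are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the probability of observing `a` or more in the top-left
    cell when the row and column totals are held fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Hypothetical table: rows = candidate / non-candidate genes,
# columns = members / non-members of a transcription factor family.
p_enrichment = fisher_exact_greater(3, 1, 1, 3)
```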

Measures of statistical significance, percentage of variance explained, correlation values, and other outputs and details of individual statistical analyses are associated with the individual analyses summarized in the Supplemental Datasets.

Accession numbers

Accession numbers are included in the Supplemental Datasets.

Supplemental data

Supplemental Figure S1. Pairwise Pearson correlation matrix of the phenotypic traits.

Supplemental Figure S2. Relationship between leaf size and leaf shape clustering.

Supplemental Figure S3. Trait heritability values.

Supplemental Figure S4. Summary of the indel mutations identified in the P. deltoides × γ-irradiated P. nigra population for which we sequenced the leaf transcriptome.

Supplemental Figure S5. Illustration of the correlated nature of our indel chromosomal bins and the method used to define the clusters of dependent bins.

Supplemental Figure S6. Comparison of the co-expression networks before and after gene expression normalization based on CNV.

Supplemental Figure S7. Enrichment in transcription factors in each set of genes identified using the various analysis approaches.

Supplemental Figure S8. Percentage of the variance explained by the various approaches.

Supplemental Figure S9. Production of irradiation hybrid lines and chromosomal bin analysis for the identification of dQTL.

Supplemental Figure S10. Summary of the indel mutations identified in the P. deltoides × γ-irradiated P. nigra population phenotyped in an experimental field.

Supplemental Figure S11. Illustration of the elliptic Fourier analysis performed on one particular leaf sample.

Supplemental Figure S12. Optimization of the number of harmonics used in the Fourier series.

Supplemental Figure S13. Variance stabilization of the raw gene counts.

Supplemental File S1. Distribution of the raw phenotypic trait values according to the line indel status.

Supplemental File S2. Manhattan plots of the −log10 (adjusted P-values) of the Kendall-rank correlation tests between the relative dosage ratio and the phenotypic BLUPs calculated for each individual chromosomal bin.

Supplemental Dataset S1. Independence of the distribution of lesion and control non-lesion lines was tested in each of the 5 K-means clusters of the first and second principal components.

Supplemental Dataset S2. Point estimates and 1,000 bootstraps confidence interval of the broad-sense heritability of the phenotypic traits.

Supplemental Dataset S3. List of the phenotype dQTL identified in the P. deltoides × gamma-irradiated P. nigra population.

Supplemental Dataset S4. Heritability estimates and associated P-values of the linear mixed models for individual gene expression data across the three replicated ramets of individual lines composing the gene expression dataset.

Supplemental Dataset S5. GO terms under the category biological process found to be enriched in each WGCNA module at P < 0.05.

Supplemental Dataset S6. List and annotation of the differentially expressed genes under the phenotype dQTL intervals of extreme phenotypic mutants.

Supplemental Dataset S7. For each phenotypic trait, number of differentially expressed genes in extreme phenotypic mutants.

Supplemental Dataset S8. Phenotype dQTL + DE genes + module eigengene.

Supplemental Dataset S9. List of the cis- and trans-expression QTL genes, as well as the differentially expressed genes in extreme phenotypic mutants underlying the phenotype dQTL bins.

Supplemental Dataset S10. Number of genes highlighted by one, two, or three analyses (“lines of evidence”) in each QTL interval, in comparison to the total number of annotated genes in this same interval based on the reference P. trichocarpa genome.

Supplemental Dataset S11. List of the leaf phenotypic traits analyzed and their year of measurement.

Acknowledgments

The authors thank Brian Stanton and Kat Haiby of Greenwood Resources for technical assistance and germplasm. They also thank summer intern students from the Institute of Forest Genetics Summer Intern program Keishla Perez, Mark Wolford, Maricela Abarca, Jorge Sanchez, Noelani Parker, Christopher D. Veloira, and Daniela Rodriguez-Zaccaro for assistance in digitizing poplar leaves.

Funding

This work was supported by the joint USDA DOE Feedstock Genomics Programs (DOE Grant DE-SC0005581).

Conflict of interest statement. None declared.

Contributor Information

Héloïse Bastiaanse, Pacific Southwest Research Station, US Forest Service, Davis, California 95618; Genome Center, University of California Davis, Davis 95616.

Isabelle M Henry, Genome Center, University of California Davis, Davis 95616; Department of Plant Biology, University of California Davis, Davis 95616.

Helen Tsai, Genome Center, University of California Davis, Davis 95616; Department of Plant Biology, University of California Davis, Davis 95616.

Meric Lieberman, Genome Center, University of California Davis, Davis 95616; Department of Plant Biology, University of California Davis, Davis 95616.

Courtney Canning, Pacific Southwest Research Station, US Forest Service, Davis, California 95618.

Luca Comai, Genome Center, University of California Davis, Davis 95616; Department of Plant Biology, University of California Davis, Davis 95616.

Andrew Groover, Pacific Southwest Research Station, US Forest Service, Davis, California 95618; Genome Center, University of California Davis, Davis 95616.

H.B. oversaw the collection of phenotypic data and performed the statistical analyses. H.B. and A.G. drafted the manuscript. H.T. produced the RNA sequencing libraries. M.L. processed and mapped the sequencing reads. C.C. oversaw the plant growth and assisted in primary data collection. I.M.H., L.C., and A.G. contributed to the statistical analyses, and conceived of the project. I.M.H. and L.C. edited the manuscript. A.G. directed the overall project.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (https://academic.oup.com/plcell) is: Andrew Groover (agroover@fs.fed.us).

References

  1. Alonge M, Wang X, Benoit M, Soyk S, Pereira L, Zhang L, Suresh H, Ramakrishnan S, Maumus F, Ciren D, et al. (2020) Major impacts of widespread structural variation on gene expression and crop improvement in tomato. Cell  182: 145–161.e23 [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Amborella Genome Project (2013) The Amborella genome and the evolution of flowering plants. Science  342: 1241089. [DOI] [PubMed] [Google Scholar]
  3. Anders S, Pyl PT, Huber W (2015) HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics  31: 166–169 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bastiaanse H, Zinkgraf M, Canning C, Tsai H, Lieberman M, Comai L, Henry I, Groover A (2019) A comprehensive genomic scan reveals gene dosage balance impacts on quantitative traits in Populus trees. Proc Natl Acad Sci U S A  116: 13690–13699 [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Birchler JA, Newton KJ (1981) Modulation of protein levels in chromosomal dosage series of maize: the biochemical basis of aneuploid syndromes. Genetics  99: 247–266 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Birchler JA, Veitia RA (2012) Gene balance hypothesis: connecting issues of dosage sensitivity across biological disciplines. Proc Natl Acad Sci U S A  109: 14746–14753 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Birchler JA, Veitia RA (2010) The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol  186: 54–62 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Blakeslee AF (1921) The globe, a simple trisomic mutant in Datura. Proc Natl Acad Sci U S A  7: 148–152 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Blakeslee AF (1922) Variations in Datura due to changes in chromosome number. Am Nat  56: 16–31 [Google Scholar]
  10. Bonhomme V, Picq S, Gaucherel C, Claude J (2014) Momocs: Outline Analysis UsingR. J Stat Softw  56: 1–24 [Google Scholar]
  11. Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc B Methodol  26: 211–243 [Google Scholar]
  12. Bradshaw HD, Ceulemans R, Davis J, Stettler R (2000) Emerging model systems in plant biology: poplar (Populus) as a model forest tree. J Plant Growth Regul  19: 306–313 [Google Scholar]
  13. Bylesjö M, Segura V, Soolanayakanahally RY, Rae AM, Trygg J, Gustafsson P, Jansson S, Street NR (2008) LAMINA: a tool for rapid quantification of leaf size and shape parameters. BMC Plant Biol  8: 82. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Causse M, Desplat N, Pascual L, Le Paslier M-C, Sauvage C, Bauchet G, Bérard A, Bounon R, Tchoumakov M, Brunel D, et al. (2013) Whole genome resequencing in tomato reveals variation associated with introgression and breeding events. BMC Genomics  14: 791. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Chitwood DH, Sinha NR (2016) Evolutionary and environmental forces sculpting leaf development. Curr Biol  26: R297–R306 [DOI] [PubMed] [Google Scholar]
  16. Civelek M, Lusis AJ (2014) Systems genetics approaches to understand complex traits. Nat Rev Genet  15: 34–48 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics  29: 15–21 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Fehrmann RSN, Karjalainen JM, Krajewska M, Westra H-J, Maloney D, Simeonov A, Pers TH, Hirschhorn JN, Jansen RC, Schultes EA, et al. (2015) Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat Genet  47: 115–125 [DOI] [PubMed] [Google Scholar]
  19. Fetter KC, Eberhardt S, Barclay RS, Wing S, Keller SR (2019) StomataCounter: a neural network for automatic stomata identification and counting. New Phytol  223: 1671–1681 [DOI] [PubMed] [Google Scholar]
  20. Feulner PGD, De-Kayne R (2017) Genome evolution, structural rearrangements and speciation. J Evol Biol  30: 1488–1490 [DOI] [PubMed] [Google Scholar]
  21. Henry IM, Dilkes BP, Miller ES, Burkart-Waco D, Comai L (2010) Phenotypic consequences of aneuploidy in Arabidopsis thaliana. Genetics  186: 1231–1245 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Henry IM, Zinkgraf MS, Groover AT, Comai L (2015) A system for dosage-based functional genomics in poplar. Plant Cell  27: 2370–2383 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Jansson S, Bhalerao R, Groover A (2010) Genetics and Genomics of Populus. Springer, New York, Dordrecht, Heidelberg, London [Google Scholar]
  24. Kierzkowski D, Kierzkowski D, Runions A, Vuolo F, Strauss S, Lymbouridou R, Routier-Kierzkowska A-L, Wilson-Sánchez D, Jenke H, Galinha C, et al. (2019) A growth-based framework for leaf shape development and diversity. Cell  177: 1405–1418.e17 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Koornneef M, Van der Veen JH (1983) Trisomics in Arabidopsis thaliana and the location of linkage groups. Genetica  61: 41–46 [Google Scholar]
  26. Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics  9:559. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Law CW, Chen Y, Shi W, Smyth GK (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol  15: R29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup (2009) The sequence alignment/map format and SAMtools. Bioinformatics  25: 2078–2079 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Lloyd J, Meinke D (2012) A comprehensive dataset of genes with a loss-of-function mutant phenotype in Arabidopsis. Plant Physiol  158: 1115–1129 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Lye ZN, Purugganan MD (2019) Copy number variation in domestication. Trends Plant Sci  24: 352–365 [DOI] [PubMed] [Google Scholar]
  31. Mayer U, Torres Ruiz RA, Berleth T, Miséra S, Jürgens G (1991) Mutations affecting body organization in the Arabidopsis embryo. Nature  353: 402–407 [Google Scholar]
  32. Meinke DW (2013) A survey of dominant mutations in Arabidopsis thaliana. Trends Plant Sci  18: 84–91 [DOI] [PubMed] [Google Scholar]
  33. Meinke DW (2020) Genome‐wide identification of EMBRYO-DEFECTIVE (EMB) genes required for growth and development in Arabidopsis. New Phytol  226: 306–325 [DOI] [PubMed] [Google Scholar]
  34. Moreno-Moral A, Pesce F, Behmoaras J, Petretto E (2017) Systems genetics as a tool to identify master genetic regulators in complex disease. Methods Mol Biol  1488: 337–362 [DOI] [PubMed] [Google Scholar]
  35. Moreno-Moral A, Petretto E (2016) From integrative genomics to systems genetics in the rat to link genotypes to phenotypes. Dis Model Mech  9: 1097–1110 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Nadeau JH, Dudley AM (2011) Systems genetics. Science  331: 1015–1016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Naulin PI, Valenzuela G, Estay SA (2017) Size matters: point pattern analysis biases the estimation of spatial properties of stomata distribution. New Phytol  213: 1956–1960 [DOI] [PubMed] [Google Scholar]
  38. Orr-Weaver TL (2015) When bigger is better: the role of polyploidy in organogenesis. Trends Genet  31: 307–315 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Pinosio S, Giacomello S, Faivre-Rampant P, Taylor G, Jorge V, Le Paslier MC, Zaina G, Bastien C, Cattonaro F, Marroni F, et al. (2016) Characterization of the poplar pan-genome by genome-wide identification of structural variation. Mol Biol Evol  33: 2706–2719 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Porth I, Klápště J, McKown AD, La Mantia J, Hamelin RC, Skyba O, Unda F, Friedmann MC, Cronk QCB, Ehlting J, et al. (2014) Extensive functional pleiotropy of REVOLUTA substantiated through forward genetics. Plant Physiol  164: 548–554 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Rae AM, Street NR, Robinson KM, Harris N, Taylor G (2009) Five QTL hotspots for yield in short rotation coppice bioenergy poplar: the poplar biomass loci. BMC Plant Biol  9: 23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK (2015) limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res  43: e47. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics  26: 139–140 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Robinson MD, Oshlack A (2010) A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol  11: R25. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Rudra P, Shi WJ, Vestal B, Russell PH, Odell A, Dowell RD, Radcliffe RA, Saba LM, Kechris K (2017) Model based heritability scores for high-throughput sequencing data. BMC Bioinformatics  18: 143. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Runions A, Tsiantis M (2017) The shape of things to come: from typology to predictive models for leaf diversity. Am J Bot  104: 1437–1441 [DOI] [PubMed] [Google Scholar]
  47. Saxena RK, Edwards D, Varshney RK (2014) Structural variations in plant genomes. Brief Funct Genomics  13: 296–307 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Schneider CA, Rasband WS, Eliceiri KW (2012)  NIH image to ImageJ: 25 years of image analysis. Nat Methods  9: 671–675 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res  13: 2498–2504 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Singh RJ (2016) Plant Cytogenetics. CRC Press [Google Scholar]
  51. Song MJ, Potter BI, Doyle JJ, Coate JE (2020) Gene balance predicts transcriptional responses immediately following ploidy change in. Plant Cell  32: 1434–1448 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Soyk S, Lemmon ZH, Sedlazeck FJ, Jiménez-Gómez JM, Alonge M, Hutton SF, Van Eck J, Schatz MC, Lippman ZB (2019) Duplication of a domestication locus neutralized a cryptic variant that caused a breeding barrier in tomato. Nat Plants 5: 471–479
  53. Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, et al. (2007) Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315: 848–853
  54. Sun S, Zhou Y, Chen J, Shi J, Zhao H, Zhao H, Song W, Zhang M, Cui Y, Dong X, et al. (2018) Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nat Genet 50: 1289–1295
  55. Swanson-Wagner RA, Eichten SR, Kumari S, Tiffin P, Stein JC, Ware D, Springer NM (2010) Pervasive gene content variation and copy number variation in maize and its undomesticated progenitor. Genome Res 20: 1689–1699
  56. Tsukaya H (2005) Leaf shape: genetic controls and environmental factors. Int J Dev Biol 49: 547–555
  57. Tsukaya H (2006) Mechanism of leaf-shape determination. Annu Rev Plant Biol 57: 477–496
  58. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al. (2006) The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596–1604
  59. Verlinden MS, Broeckx LS, Van den Bulcke J, Van Acker J, Ceulemans R (2013) Comparative study of biomass determinants of 12 poplar (Populus) genotypes in a high-density short-rotation culture. Forest Ecol Manage 307: 101–111
  60. Xia W, Xiao Z, Cao P, Zhang Y, Du K, Wang N (2018) Construction of a high-density genetic map and its application for leaf shape QTL mapping in poplar. Planta 248: 1173–1185
  61. Xie Y, Huhn K, Brandt R, Potschin M, Bieker S, Straub D, Doll J, Drechsler T, Zentgraf U, Wenkel S (2014) REVOLUTA and WRKY53 connect early and late leaf development in Arabidopsis. Development 141: 4772–4783
  62. Yang Z, Ge X, Yang Z, Qin W, Sun G, Wang Z, Li Z, Liu J, Wu J, Wang Y, et al. (2019) Extensive intraspecific gene order and gene structural variations in upland cotton cultivars. Nat Commun 10: 2989
  63. Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4: 17
  64. Zhu Q-H, Guo A-Y, Gao G, Zhong Y-F, Xu M, Huang M, Luo J (2007) DPTF: a database of poplar transcription factors. Bioinformatics 23: 1307–1308
  65. Zinkgraf M, Haiby K, Lieberman MC, Comai L, Henry IM, Groover A (2017) Creation and genomic analysis of irradiation hybrids in Populus. Curr Protoc Plant Biol 1: 431–450
  66. Zmienko A, Marszalek-Zenczak M, Wojciechowski P, Samelak-Czajka A, Luczak M, Kozlowski P, Karlowski WM, Figlerowicz M (2020) AthCNV: a map of DNA copy number variations in the Arabidopsis genome. Plant Cell 32: 1797–1819

Articles from The Plant Cell are provided here courtesy of Oxford University Press