PLOS One. 2021 Oct 28;16(10):e0259278. doi: 10.1371/journal.pone.0259278

Multivariate genome-wide association study of leaf shape in a Populus deltoides and P. simonii F1 pedigree

Wenguo Yang 1,2, Dan Yao 1, Hainan Wu 1, Wei Zhao 1, Yuhua Chen 1, Chunfa Tong 1,*
Editor: Karthikeyan Adhimoolam 3
PMCID: PMC8553126  PMID: 34710178

Abstract

Leaf morphology exhibits tremendous diversity between and within species, and is likely related to adaptation to environmental factors. Most poplar species are of great economic and ecological value, and their leaf morphology can be a good predictor of wood productivity and environmental adaptation. It is important to understand the genetic mechanism behind variation in leaf shape. Although some initial efforts have been made to identify quantitative trait loci (QTLs) for poplar leaf traits, more effort is needed to unravel the polygenic architecture of the complex traits of leaf shape. Here, we performed a genome-wide association study (GWAS) of poplar leaf shape traits in a randomized complete block design with clones from F1 hybrids of Populus deltoides and Populus simonii. Based on a multivariate linear mixed model, a total of 35 SNPs were identified as significantly associated with the multiple traits of a moderate number of regular polar radii between the leaf centroid and its edge points, which could represent the leaf shape. In contrast, applying the univariate linear mixed model to single leaf traits led to genomic inflation; thus, no significant SNPs were detected for leaf length, measures of leaf width, leaf area, or the ratio of leaf length to leaf width under genomic control. Investigation of the candidate genes showed that most flanking regions of the significant leaf shape-associated SNPs harbored genes that were related to leaf growth and development and to the regulation of leaf morphology. The combined use of the traditional experimental design and the multivariate linear mixed model could greatly improve the power of GWAS because the multiple trait data from a large number of individuals with replicates of clones were incorporated into the statistical model. The results of this study will enhance the understanding of the genetic mechanism of leaf shape variation in Populus. In addition, a moderate number of regular leaf polar radii can largely represent the leaf shape and can be used for GWAS of such a complicated trait in Populus, instead of the higher-dimensional regular radius data that were previously considered to represent leaf shape well.

Introduction

Leaves are the most fundamental photosynthetic organs in plants; they are responsible for absorbing solar energy to generate power for plant growth and thus provide food for many species on earth [1, 2]. Leaf morphology exhibits tremendous diversity between and within species, such as the broad leaves of poplars and needle leaves of conifers. Leaf size and shape are evolutionarily adapted to environmental changes in response to water and light stress [3, 4], making it possible to reconstruct the paleoclimate [5, 6]. In model systems, several genes and networks have been identified to affect initial leaf development and pattern formation [2, 7, 8] as well as leaf length and width [9–11] using the mutagenesis screening method. Moreover, quantitative trait loci have also been detected for leaf morphological traits in species such as tomato [12], Arabidopsis [13], Brassica [14], maize [15], barley [16], and Populus [17, 18]. Despite advances made in these studies, the identified genes or loci may only cover a portion of the leaf morphological variation observed in nature because the variation is considered to be under polygenic control [11, 19].

The genus Populus (2n = 38) is an ecologically and economically important group of trees with a wide distribution in diverse environments of the Northern Hemisphere [20, 21]. The genus, comprising approximately 30 species, was grouped into six sections (i.e., Abaso, Aigeiros, Leucoides, Populus, Tacamahaca, and Turanga) according to morphological parameters [22]. Most species have several attractive biological characteristics, such as fast growth and easy asexual reproduction, so they are of particular interest to forest breeders for developing new cultivars to meet the needs of pulp, paper, lumber, and biofuels industries. Several studies have shown that leaf traits are highly related to growth and habitat and can be used as predictors of productivity and determinant factors in phylogenetic relationships [11, 23, 24]. Therefore, persistent efforts have been made to dissect the genetic mechanism of morphological traits in the genus. In the 1990s, Wu et al. [25] first conducted QTL mapping of leaf morphology in F2 hybrids of P. trichocarpa × P. deltoides, with up to 3 QTLs identified for each trait, leaf area and the ratio of leaf length to width, at four crown positions. Recently, McKown et al. [26] found 6 and 5 SNPs significantly associated with leaf length and width, respectively, in a GWAS on unrelated wild accessions of Populus trichocarpa. Drost et al. [11] detected 2 QTLs for lamina length, 2 for width, and 5 for their ratio in a pseudobackcross population of P. trichocarpa and P. deltoides. More recently, Chhetri et al. [17, 18] performed a GWAS of many traits with different genotypes from natural populations of P. trichocarpa; they did not detect any significant SNPs for single leaf traits, including leaf length, width, perimeter, area, and aspect ratio, but they detected up to 9 SNPs for leaf morphology multitraits. In contrast, Fu et al. [27] precisely described leaf shape with radii from the centroid to the contour at regular intervals and performed a marker-trait association analysis of principal components of the high-dimensional radius data, leading to several QTLs identified for leaf shape in a natural population of P. szechuanica var. tibetica. They further modeled the leaf contour of a QTL genotype as a dynamic trajectory and identified a few significant QTLs for the variation of leaf shape in the same population [28]. These studies could be considered an initial stage for unravelling the genetic mechanism behind leaf size and shape in Populus. More powerful strategies, such as the utilization of novel statistical methods and generation of more accurate phenotype and genotype data, should be taken into account to ensure the accuracy and precision of such a challenging task.

Herein, we report a genome-wide association study (GWAS) of leaf size and shape with a randomized complete block design (RCBD), which was established using clones from an F1 hybrid population of P. deltoides and P. simonii [29] belonging to the sections Aigeiros and Tacamahaca, respectively. The leaves of the female P. deltoides parent are broad, while those of the male P. simonii parent are narrow. This sharp contrast led to diverse leaf shapes in their progeny (Fig 1). The leaf traits were digitally derived from scanned images of leaves, including the classical indices of leaf length, width, and area, as well as the high-dimensional data of regular ordered radii between the leaf centroid and edge points, as described in Fu et al. [27]. Single nucleotide polymorphism (SNP) genotypes of each clone were generated by mapping the paired-end (PE) reads from restriction site-associated DNA sequencing (RADseq) to the reference genome of P. trichocarpa [21]. With these SNP data, we applied a linear mixed model (LMM) to conduct GWAS for multiple or single leaf traits using the R package EMMREML (https://cran.r-project.org/web/packages/EMMREML). Consequently, we identified many more SNPs significantly associated with leaf shape than those detected in previous studies. Furthermore, candidate genes associated with these SNPs were investigated to show that most flanking regions of these significant SNPs harbored genes that were related to leaf growth and development and to the regulation of leaf morphology. The results enhanced our understanding of the genetic mechanism of leaf shape variation in Populus. We demonstrated that the combined use of the traditional experimental design and the multivariate linear mixed model (mvLMM) could greatly improve the power of GWAS for leaf shape. Additionally, the multivariate data of a moderate number of regular leaf polar radii can largely represent the leaf shape and can be used for GWAS of such a complicated trait in Populus. This is contrary to the expectation that the high-dimensional regular radius data could well represent the leaf shape for GWAS, as indicated by Fu et al. [27].

Fig 1. Leaf shape variation among parents and progeny of P. deltoides × P. simonii.

Materials and methods

Genetic experimental design and measurement of leaf traits

To obtain repeated phenotypic data for accurate QTL analysis, we established an RCBD in the spring of 2017 with clones from an F1 hybrid population of the female P. deltoides and the male P. simonii, which was produced in the springs of 2009 to 2011 [29, 30]. The design consisted of 163 clones with 3 blocks, 6 cuttings for each clone per plot within a block, and a 50 × 60 cm spacing in Xiashu Forest Farm of Nanjing Forestry University, Jurong County, Jiangsu Province, China (32.1224°N, 119.2155°E). The sixth most apical mature leaf of each individual was sampled in mid-July 2017 by placing it into a paper bag and then scanned using a Hewlett-Packard Scanjet G2410 A4 flatbed scanner at a resolution of 300 dpi. The A4-sized images were saved as bmp files, and leaf size and shape traits were analyzed with the R package LeafShape (https://github.com/tongchf/LeafShape). These traits included leaf area (A), length (L), maximum width (W), and widths at one-third length (W1/3), half length (W1/2), and two-thirds length (W2/3) from the base, as well as 360 regular polar radii (RD360) between the leaf centroid and edge points, as shown in Fig 2. As a first analytical step, we applied multivariate statistical methods to these leaf traits with SAS 9.3 software (SAS Institute, Cary, USA), including canonical correlation analysis and principal component analysis.
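The regular polar radii can be illustrated with a minimal Python sketch. It is not the LeafShape implementation: the function name `regular_polar_radii`, the use of the mean of the edge points as the centroid, and the nearest-angle lookup are all assumptions made only for illustration.

```python
import math

def angular_gap(a, b):
    """Smallest absolute difference between two angles on the circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def regular_polar_radii(edge_points, n_radii=360):
    """Centroid-to-edge distances at n_radii evenly spaced angles from -pi to pi."""
    # Assumption: the centroid is approximated by the mean of the edge points.
    cx = sum(x for x, _ in edge_points) / len(edge_points)
    cy = sum(y for _, y in edge_points) / len(edge_points)
    # Polar coordinates (angle, radius) of every edge point around the centroid.
    polar = [(math.atan2(y - cy, x - cx), math.hypot(x - cx, y - cy))
             for x, y in edge_points]
    radii = []
    for k in range(n_radii):
        target = -math.pi + 2 * math.pi * k / n_radii
        # Assumption: take the edge point whose angle is closest to the target.
        nearest = min(polar, key=lambda item: angular_gap(item[0], target))
        radii.append(nearest[1])
    return radii

# Toy contour: a circle of radius 50 centred at (100, 80), sampled at 720 points.
circle = [(100 + 50 * math.cos(2 * math.pi * i / 720),
           80 + 50 * math.sin(2 * math.pi * i / 720)) for i in range(720)]
radii = regular_polar_radii(circle)
```

For a real leaf contour the radii vary with angle, and that vector of radii is what serves as the multivariate shape phenotype.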

Fig 2. Workflow of the R package LeafShape for extracting leaf shape traits in poplar hybrids: (A) a fresh leaf; (B) original (blue) and position-adjusted (red) edge points extracted from the scanned image of a leaf; (C) the red vertical line indicates the leaf length (L), the red horizontal line indicates the maximum leaf width (W), and the blue horizontal lines represent the leaf widths at one-third length (W1/3), half length (W1/2), and two-thirds length (W2/3); (D) 360 regular polar radii between the leaf centroid and edge points across −π to π.

SNP genotyping

Since 2013, more than four hundred individuals in the poplar hybrid population have been sequenced with RADseq technology in several batches [29, 30]. The 163 clones from the RCBD experiment and their two parents were sequenced previously, and their RADseq paired-end (PE) reads were deposited in the NCBI SRA database with the accession numbers listed in S1 Table. The PE reads for each individual were filtered using the NGS QC toolkit with default parameters [31], and the resulting high-quality (HQ) read data were used for calling SNP genotypes. The SNP calling procedures were largely the same as those described in Mousavi et al. [30]. In brief, the PE reads of each clone or parent were first aligned to the reference sequence of P. trichocarpa with the software BWA v0.7.17 [32]. Second, the resulting SAM (sequence alignment/map) file was converted into a BAM (binary alignment/map) formatted file and then sorted and indexed with SAMtools v1.9 [33]. Third, the sorted BAM file was analyzed to generate a VCF file using BCFtools v1.9 software [33]. Finally, the VCF file was filtered to obtain SNP genotypes of each individual such that a heterozygous allele had a read coverage depth (DP) of at least 3 and the quality of each SNP genotype was greater than 30.

Because the 163 clones were from an F1 hybrid population, the SNPs were expected to segregate in several different patterns, such as aa×ab and ab×ab, due to the characteristics of outbred forest species [34, 35]. We classified the SNPs into subsets according to the segregation patterns and kept those that did not seriously deviate from Mendelian segregation ratios (p > 0.01), including 1:1, 1:2:1, and 1:1:1:1. In addition, if an SNP had more than 10% missing genotypes across the clones, it was removed from the dataset. To obtain independent SNP markers and linkage disequilibrium (LD) blocks, we performed the LD-based SNP pruning procedure for all the SNPs using PLINK v1.07 software with a window size of 25 SNPs, a step size of 2 SNPs, and an r² threshold of 0.7 [36].
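The segregation and missing-rate filters can be sketched as follows. This is an illustrative sketch, not the authors' script: the function names are hypothetical, and a Pearson chi-square goodness-of-fit test is assumed for the Mendelian check, since the text states only the p > 0.01 criterion.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for df = 1, 2, or 3."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("only df = 1, 2, 3 are needed for 1:1, 1:2:1 and 1:1:1:1")

def mendelian_p(counts, ratio):
    """Goodness-of-fit p-value of observed genotype counts against a ratio."""
    n = sum(counts)
    expected = [n * r / sum(ratio) for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    return chi2_sf(stat, len(counts) - 1)

def keep_snp(counts, ratio, n_clones, alpha=0.01, max_missing=0.10):
    """Keep a SNP if at most 10% of genotypes are missing and p > alpha."""
    missing_rate = 1 - sum(counts) / n_clones
    return missing_rate <= max_missing and mendelian_p(counts, ratio) > alpha

# Hypothetical genotype counts among 163 clones.
print(keep_snp((80, 78), (1, 1), 163))         # balanced 1:1 SNP
print(keep_snp((110, 48), (1, 1), 163))        # strongly distorted SNP
print(keep_snp((40, 82, 38), (1, 2, 1), 163))  # ab x ab SNP
```

The closed-form tail probabilities avoid any dependency beyond the standard library; with SciPy available, `scipy.stats.chisquare` would be a natural substitute.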

Statistical methods for GWAS

Since the poplar leaf is generally symmetrical, the polar radii on the right side largely contain the full information of all the radius data on both sides (Fig 2D). We used the multiple traits of different numbers of regular radii across −π/2 to π/2 to find SNPs associated with leaf shape, which was implemented with the mvLMM as follows:

$$y_{ijkl} = \mu_l + B_{il} + M_{jl} + G_{jl} + e_{ijkl} \tag{1}$$

where $y_{ijkl}$ is the $l$th polar radius of the $k$th tree leaf of the $j$th clone in the $i$th block; $\mu_l$ is the overall mean of the $l$th polar radius; $B_{il}$ is the effect of the $i$th block; $M_{jl}$ is the genotype effect of the $j$th clone at any tested SNP; $G_{jl}$ is the polygenic background effect of the $j$th clone; and $e_{ijkl}$ is the residual effect. It is assumed that $B_{il}$ and $M_{jl}$ are fixed effects, while $G_{jl}$ and $e_{ijkl}$ are random effects with $G_{jl} \sim N(0, \sigma_{gl}^2)$, $e_{ijkl} \sim N(0, \sigma_{el}^2)$, and $\mathrm{cov}(G_{jl}, e_{ijkl}) = 0$. In matrix form, model (1) can be written as

$$Y = XB + ZG + E \tag{2}$$

where $Y$ is the $n \times t$ matrix whose $(i, j)$th element is the $j$th trait value of the $i$th individual, i.e., the $i$th row of $Y$ is the multiple trait data for the $i$th individual; $X$ is an $n \times p$ known design matrix of fixed effects, including the overall mean, block effects, and individual genotype effects at the tested SNP site; $B$ is the $p \times t$ matrix of fixed effect coefficients; $G$ is the $c \times t$ matrix whose $(i, j)$th element is the background random additive genetic effect of the $i$th clone and the $j$th trait with $\mathrm{Vec}(G) \sim N_{c \times t}(0, V_G \otimes A)$, where Vec denotes the matrix vectorization function [37], $A$ is the additive relationship matrix for the $c$ clones, and $V_G$ is the additive genetic variance matrix for the $t$ traits; $Z$ is the $n \times c$ coefficient matrix corresponding to the matrix $G$; and $E$ is the random residual matrix with $\mathrm{Vec}(E) \sim N_{n \times t}(0, V_E \otimes I_n)$. Based on the assumptions above, the covariance matrix of $\mathrm{Vec}(Y)$ can be derived as

$$V = \mathrm{cov}(\mathrm{Vec}(Y)) = V_G \otimes ZAZ' + V_E \otimes I_n \tag{3}$$

Because the clones in our experiment belonged to a full-sib family and their parents were unrelated and not inbred, the kinship coefficient of any two clones is 0.25 in theory [38]. Since the additive relationship between two individuals is twice their kinship coefficient, the relationship matrix A has ones on the diagonal and 0.5 elsewhere.
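The structure of the covariance matrix in Eq (3) can be made concrete with a toy example. The sketch below uses plain nested lists; the helper names and the numerical values of the variance matrices (`V_G`, `V_E`) are hypothetical and chosen only to show how the Kronecker products combine. Vec is taken to stack the columns of Y trait by trait.

```python
def full_sib_relationship(n_clones, r=0.5):
    """Relationship matrix A: 1 on the diagonal, r between distinct full sibs."""
    return [[1.0 if i == j else r for j in range(n_clones)]
            for i in range(n_clones)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Toy design: n = 3 trees from c = 2 clones (two ramets of clone 1), t = 2 traits.
Z = [[1, 0], [1, 0], [0, 1]]
A = full_sib_relationship(2)
V_G = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical genetic (co)variances
V_E = [[1.0, 0.2], [0.2, 0.5]]   # hypothetical residual (co)variances

ZAZt = matmul(matmul(Z, A), transpose(Z))
V = add(kron(V_G, ZAZt), kron(V_E, identity(3)))   # 6 x 6 matrix of Eq (3)
```

In this toy case two ramets of the same clone share the full genetic variance for a trait (V[0][1] = 2.0), whereas trees from different full-sib clones share half of it (V[0][2] = 1.0).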

To test whether any single SNP was significantly associated with the leaf traits, an F statistic was used under the null hypothesis $M\,\mathrm{Vec}(B) = 0$ for a full-rank $q \times pt$ matrix $M$, as

$$F = \frac{1}{q}\,\big(M\,\mathrm{Vec}(\hat{B})\big)'\,\Big[M\big((I_t \otimes X)'\,V^{-1}\,(I_t \otimes X)\big)^{-1}M'\Big]^{-1}\big(M\,\mathrm{Vec}(\hat{B})\big) \tag{4}$$

with $q$ numerator degrees of freedom and $t(n - p)$ denominator degrees of freedom [39], where $\hat{B}$ denotes the estimate of $B$. The p-values from the F statistics (4) for each SNP are prone to genomic inflation [40, 41]. It is necessary to calculate the genomic inflation factor ($\lambda_{GC}$) to evaluate the inflation level. When there was no genomic inflation, the p-value threshold for testing significant SNPs was determined based on Bonferroni correction at the 0.05 significance level.
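One common way to compute a genomic inflation factor from a set of p-values is sketched below. It assumes the widely used definition in which each p-value is converted to the equivalent 1-df chi-square statistic and the median is compared with its expectation under the null; the text does not state which definition the authors applied to their F-test p-values, so this is an illustration rather than their procedure.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

def p_to_chi2(p):
    """1-df chi-square statistic whose upper-tail probability equals p (0 < p <= 1)."""
    z = _STD_NORMAL.inv_cdf(p / 2)   # lower-tail quantile, stable for small p
    return z * z

def genomic_inflation_factor(p_values):
    """Median observed chi-square divided by the null median (about 0.455)."""
    observed = median(p_to_chi2(p) for p in p_values)
    expected = p_to_chi2(0.5)
    return observed / expected

# Toy inputs: evenly spread p-values (no inflation) and a deflated-p version.
uniform = [(i - 0.5) / 1000 for i in range(1, 1001)]
inflated = [p * p for p in uniform]
lambda_uniform = genomic_inflation_factor(uniform)
lambda_inflated = genomic_inflation_factor(inflated)
```

Values near 1 indicate well-calibrated p-values, values above 1 indicate inflation, and values below 1 indicate deflation. Under genomic control, the test statistics are typically divided by the inflation factor before p-values are recomputed.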

The proportion of phenotype variance explained (PVE) by a single SNP was calculated as

$$R^2 = 1 - \frac{RSS}{RSS_0} \tag{5}$$

where $RSS_0$ and $RSS$ are the residual sums of squares under the null hypothesis model and the full model (2), respectively [42].

As a comparison with the regular radius data, we also used mvLMM (2) to perform GWAS for the multiple traits of L, W, W1/3, W1/2, and W2/3 (LWs) (Fig 2C). For a single leaf trait such as length, width, and area, the GWAS was conducted with univariate LMM, which can be derived by simplifying the multivariate model (2) and is expressed as

$$y = X\beta + Zg + e \tag{6}$$

where $y$ is a vector of trait values for $n$ individuals; $X$ is a design matrix of fixed effects; $\beta$ is a vector of fixed effects; $g$ is a vector of random genetic effects for each clone with $g \sim N(0, \sigma_g^2 I_c)$; $Z$ is the coefficient matrix corresponding to the random vector $g$; and $e$ is the random residual vector with $e \sim N(0, \sigma_e^2 I_n)$. Moreover, we calculated the narrow-sense heritability of a single trait as $h^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$ without incorporating the fixed effects of SNPs in model (6).

To calculate the restricted maximum likelihood (REML) estimates of genetic parameters, we applied the function emmremlMultivariate for the multivariate model (2) and emmreml for the univariate model (6) in the R package EMMREML (https://cran.r-project.org/web/packages/EMMREML). After the genetic parameters were calculated, the p-value for testing each SNP was calculated according to the F statistic, as in Eq (4).

Investigation of candidate genes

In our previous study [43], the average LD block length was estimated to be ~650 bp in the same hybrid population, which is too short to serve as a downstream or upstream range for investigating candidate genes for the significant SNPs. Instead, we adopted the strategy described in Slaten et al. [44]. In brief, we considered candidate genes that contain significant SNPs or are within an LD block harboring significant SNPs. If a significant SNP is within an intergenic region and does not form an LD block, both the closest downstream and upstream genes are considered as candidates. Because no annotation information for leaf shape is available in the gene annotation database of P. trichocarpa in Phytozome (v4.1; https://genome.jgi.doe.gov), the coding sequences (CDSs) of these genes were obtained for further annotation. We first performed BLAST searches with their CDSs against the nonredundant protein database [45, 46] and then mapped all BLAST hits to Gene Ontology (GO) terms based on ID mapping information from http://ftp.pir.georgetown.edu/databases/idmapping/idmapping.tb.gz. The descriptions of the BLAST hits and GO terms were saved in an Excel file, which we searched for genes possibly related to leaf shape.

Results

Leaf trait data

We successfully obtained the leaf trait data for a total of 2,244 individual trees belonging to 163 clones in the RCBD (S2 Table). Some plots had missing samples due to damage from pests, disease, poor rooting ability, or other unknown causes. To validate these measurements, we measured 100 randomly chosen leaves with ImageJ [47] and LeafShape software separately. The average relative differences in the leaf length, width, and area values measured by the two software programs were 1.45 (±0.99)%, 4.76 (±0.76)%, and 5.05 (±0.92)%, respectively, indicating that the measurements from the two methods were largely consistent (S3 Table). The descriptive statistics for the traits L, W, W1/3, W1/2, W2/3, A, and the L/W ratio are presented in Table 1, including the mean, standard deviation, range, and coefficient of variation (CV). The CVs for the leaf length and different leaf widths were similar, ranging from 20.79% for L to 25.14% for W2/3, while the CV for leaf area reached a maximum value of 42.25% and the CV for the length/width ratio had a minimum value of 10.12%. The histograms showed that these leaf traits approximately followed a normal distribution (S1 Fig). Furthermore, the heritabilities of leaf length and different leaf widths as well as leaf area were similar (40–50%), but the heritability of the length/width ratio was much higher at 64.74% (Table 1). In addition, correlation analysis showed that the leaf length, measures of leaf width, and leaf area were significantly positively correlated (p < 0.01) with each other, with most coefficients over 0.90; the minimum coefficient value was 0.8137 between L and W2/3 (S4 Table). However, the L/W ratio was significantly negatively correlated with each of the leaf length, different leaf widths, and leaf area traits, with coefficients between -0.6160 and -0.2389. Finally, analysis of variance showed that each leaf trait, L, W, W1/3, W1/2, W2/3, and A, differed significantly among blocks and among clones (S5 Table).

Table 1. Variation in leaf length, different leaf widths, leaf area, and the ratio of length/width in the F1 progeny of Populus deltoides × Populus simonii based on a randomized complete block design.

Trait (Unit) Mean SD Range CV (%) Heritability (%)
L (mm) 114.27 23.76 36.31 ~ 193.12 20.79 41.56
W (mm) 90.79 21.90 23.72 ~ 163.03 24.12 46.46
W1/3 (mm) 89.95 21.89 20.49 ~ 157.89 24.34 45.65
W1/2 (mm) 83.80 19.83 23.72 ~ 147.72 23.66 46.67
W2/3 (mm) 64.97 16.33 16.76 ~ 131.03 25.14 49.53
A (mm²) 7317.88 3091.52 588.66 ~ 23514.73 42.25 49.52
L/W 1.28 0.13 0.91 ~ 1.92 10.12 64.74

Notes: L, leaf length; W, maximum leaf width; W1/3, leaf width at one-third length; W1/2, leaf width at half length; W2/3, leaf width at two-thirds length; A, leaf area; L/W, the ratio of the leaf length to the maximum leaf width.

The 360 polar radii (Fig 2D) of all leaves were obtained with the R package LeafShape as a full dataset denoted as RD360. We also extracted 5 reduced datasets denoted as RD06, RD09, RD11, RD16, and RD61 that contained 6, 9, 11, 16, and 61 regular polar radii of each leaf on the right side from −π/2 to π/2, respectively; these datasets were expected to represent the leaf shape characteristics despite the lower dimensionality of the data due to leaf symmetry. Canonical correlation analysis showed that each of the commonly measured leaf traits, such as length, width, and area, was highly correlated with the polar radii in the RD360, RD61, RD16, RD11, RD09, and RD06 datasets, with a correlation coefficient value of over 0.98 and a p-value less than 0.0001 (S6 Table). Additionally, the multiple traits of leaf length and different leaf widths (LWs: L, W, W1/3, W1/2, and W2/3) were extremely significantly correlated with the 6 radius datasets, with the first canonical correlation coefficient greater than 0.9996 (p<0.0001). Moreover, the principal component analysis of the polar radius traits revealed that the proportion of total variance for the first principal component was at least 95.59% for the 6 radius datasets. The first principal component for each radius dataset was highly correlated with leaf length, different leaf widths, and leaf area, with coefficients greater than 0.90 and p-values less than 0.0001 (S7 Table).

SNP genotype data

A total of 33,086 SNPs across the 163 clones were obtained by mapping their high-quality PE reads separately to the reference genome of P. trichocarpa (v4.1; https://genome.jgi.doe.gov). For the SNP genotype of each clone at each SNP site, the heterozygous allele was required to have a coverage depth of at least 3 reads, whereas the coverage depth for a homozygous allele was at least 5. Furthermore, the quality of each genotype needed a Phred score of at least 30, and the missing genotype rate at each SNP was set to less than 10%. All the SNPs were categorized into five segregation types, aa×ab, aa×bc, ab×aa, ab×ab, and ab×cc (Table 2). The majority of SNPs segregated at a ratio of 1:1 (p > 0.01) with aa×ab and ab×aa types.

Table 2. Summary of SNPs obtained across the 163 clones based on a randomized complete block design.

Segregation type Ratio Number
aa×ab 1:1 13,385
aa×bc 1:1 76
ab×aa 1:1 19,295
ab×cc 1:1 159
ab×ab 1:2:1 171
Total 33,086

The LD analysis of these SNPs was performed with the software PLINK [36], resulting in 10,735 independent SNP markers and LD blocks. Therefore, the p-value threshold for significant SNPs in our genome-wide analyses was set to 0.05/10735 = 4.66E-6 (-log10(p-value) = 5.33) based on the Bonferroni correction at the 0.05 significance level.
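The threshold arithmetic can be checked with a few lines of Python. The function name `is_significant` is introduced here purely for illustration.

```python
import math

N_INDEPENDENT = 10735   # independent SNP markers and LD blocks from the text
ALPHA = 0.05            # family-wise significance level

threshold = ALPHA / N_INDEPENDENT        # Bonferroni-corrected p-value threshold
neg_log10_threshold = -math.log10(threshold)

def is_significant(p_value):
    """True if a SNP's p-value passes the Bonferroni-corrected threshold."""
    return p_value < threshold

print(f"{threshold:.2e}")             # threshold in scientific notation
print(round(neg_log10_threshold, 2))  # height of the line on a Manhattan plot
```

Dividing by the number of independent markers and LD blocks, rather than by all 33,086 SNPs, gives a less conservative correction because SNPs in strong LD do not constitute independent tests.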

Significant SNPs associated with leaf traits

mvLMM (2) was applied to perform the GWAS for the multiple traits of the regular polar radius datasets RD06, RD09, RD11, RD16, and RD61 separately, as well as the multiple traits of LWs. The quantile-quantile plots of the p-values on a base 10 logarithm scale showed different levels of genomic inflation, with inflation factors greater than 1 for datasets RD06, RD09, and LWs; less than 1 for datasets RD16 and RD61; and almost equal to 1 for dataset RD11 (Fig 3). Because the result from dataset RD11 showed good genomic control, we used this result to determine the significant SNPs associated with leaf shape. Consequently, a total of 35 SNPs were found to be significantly associated with the multiple traits of leaf shape under the p-value threshold of 4.66E-6, each explaining 0.18–0.32% of the phenotypic variance (Table 3). Fig 4A shows the Manhattan plot of the negative base 10 logarithm of the p-value against the corresponding SNP position. These significant SNPs were distributed on 8 chromosomes, with a few SNPs on chromosomes 4, 6, 10, 15, 16, and 18 but more on chromosomes 1 and 14. There were 11 significant SNPs detected on chromosome 1, of which the first 10 were within a 3-Mb region. More surprisingly, chromosome 14 harbored the most (16) significant SNPs, which could be divided into five regions according to the position where the negative logarithm of the p-value changed from decreasing to increasing (Fig 4B).

Fig 3. Quantile-quantile plots of observed p-values versus expected p-values on a base 10 logarithm scale with genomic inflation factors for GWAS of the different multiple traits representing the poplar leaf shape.

LWs indicate the multiple traits of leaf length and 4 different leaf widths. RD06, RD09, RD11, RD16, and RD61 indicate the multiple traits of 6, 9, 11, 16, and 61 regular polar radii between the leaf centroid and edge points across −π/2 to π/2, respectively.

Table 3. Summary of the significant SNPs associated with the leaf shape represented by the multiple trait dataset RD11.

Chr Position/Region Number P-Value PVE (%) Gene Description a
1 1561634 10 1.04E-6 0.20 Potri.001G044300; photosynthetic NDH subunit; response to light stimulus; response to light intensity; photosynthesis, light harvesting; regulation of auxin polar transport; MYB-like protein; transcription factor TCP20; photosynthetic NDH subunit; auxin-responsive protein; leaf senescence
2506253 1.07E-8 0.24 Potri.001G049900;
2703246 7.32E-8 0.22 Potri.001G051300;
2742828 2.66E-12 0.32 Potri.001G056700;
3168327 9.98E-7 0.20 Potri.001G059100;
3176353 1.27E-10 0.28 Potri.001G059800;
3620732 4.01E-6 0.18 Potri.001G060000;
3654397 1.05E-9 0.26 Potri.001G060200;
4255086 1.18E-8 0.24 Potri.001G060400;
4573250 1.1E-7 0.22 Potri.001G060900
1 11059517 1 5.76E-9 0.25 Potri.001G135925; integral component of membrane; cellular response to phosphate starvation
Potri.001G135950
4 15591673 1 2.02E-10 0.28 Potri.004G134200; VQ motif-containing protein; response to water deprivation
Potri.004G134300
6 12471729 1 1.56E-10 0.28 Potri.006G146400; mRNA cleavage factor complex; mitogen-activated protein kinase
Potri.006G146500
6 25163980 1 1.05E-9 0.26 Potri.006G253700; SNARE-like superfamily protein; ethylene-responsive transcription factor
Potri.006G253800
10 4486477 1 6.14E-7 0.20 Potri.010G030200; cellular manganese ion homeostasis; integral component of membrane
Potri.010G030400
14 501461 4 5.3E-7 0.20 Potri.014G000700; MYB family transcription factor PHL6 isoform; transcription factor MYB44; L10-interacting MYB domain-containing protein; photosynthesis, light harvesting; protein weak chloroplast movement under blue light
988295 8.81E-8 0.22 Potri.014G022500;
1554847 1.61E-9 0.26 Potri.014G026000;
1943593 6.99E-9 0.25 Potri.014G029700;
Potri.014G029800;
14 2326098 5 9.41E-8 0.22 Potri.014G034500; regulation of leaf morphogenesis; transcription factor MYB44-like; response to auxin; MYB-related protein MYBAS1-like; L10-interacting MYB domain-containing protein; Cpn60_TCP1 domain-containing protein
2609112 2.14E-9 0.26 Potri.014G035100;
3467134 1.86E-7 0.21 Potri.014G039800;
3546254 1.26E-11 0.30 Potri.014G054700;
3729803 7.5E-10 0.27 Potri.014G056100;
Potri.014G058500;
14 4014340 4 3.04E-6 0.19 Potri.014G061450; L10-interacting MYB domain-containing protein; response to red or far red light; regulation of leaf development; auxin-responsive protein; response to red or far red light; response to light stimulus; response to absence of light; protein spotted leaf 11-like
4309715 9.39E-11 0.29 Potri.014G066600;
4316988 7.08E-7 0.20 Potri.014G066700;
4564789 2.45E-7 0.21 Potri.014G066900;
Potri.014G067600;
Potri.014G073700;
Potri.014G075500;
Potri.014G076500
14 5495011 1 2.32E-9 0.26 Potri.014G081200; MYB domain-containing protein; regulation of leaf morphogenesis; MYB family transcription factor
Potri.014G087700;
Potri.014G089300
14 6606004 2 6.61E-8 0.22 Potri.014G096300; MYB-like transcription factor; auxin response factor; protein kinesin light chain-related 1 isoform; transcription factor MYB8-like; spotted leaf protein; auxin-responsive protein; leaf senescence; leaf development
7047979 3.23E-8 0.23 Potri.014G100100;
Potri.014G100400;
Potri.014G100800;
Potri.014G102000;
Potri.014G103300;
Potri.014G103500;
Potri.014G103900
15 6867668 1 3.87E-7 0.21 Potri.015G052600; calcium ion binding; protein heterodimerization activity
Potri.015G052800
16 3663208 1 4.44E-7 0.21 Potri.016G055200; integral component of membrane; accelerated cell death 11
Potri.016G055300
18 4076713 1 8.32E-7 0.20 Potri.018G046800; histidine-containing phosphotransfer protein; zinc finger family protein
Potri.018G046900
18 15471331 1 1.10E-6 0.20 Potri.018G145568; NBS-LRR type disease resistance protein

a The descriptions were chosen from the annotations in S10 Table.

Fig 4. Manhattan plot of the association analysis for the 11 regular polar radii between the leaf centroid and edge points from -π/2 to π/2.

(A) The plot shows the 19 chromosomes of the reference genome of P. trichocarpa. The horizontal dashed line indicates the genome-wide significance threshold of 5.33, which is the negative base 10 logarithm of the p-value threshold based on the Bonferroni correction at the 0.05 significance level. (B) The significant SNPs on chromosome 14 were divided into five regions roughly according to the position where the negative logarithm of the p-value changes from decreasing to increasing.

Moreover, LMM (6) was applied to detect associations with each single leaf trait, such as leaf length, width and area. The results showed that the genomic inflation factors for these traits ranged from 1.774 for W2/3 to 2.461 for the L/W ratio (Fig 5). After genomic control, we found that there were no significant SNPs associated with any single trait under the p-value threshold based on Bonferroni correction (Fig 5). However, without genomic control, various numbers of significant SNPs were found for these traits: only one significant SNP each was detected for W, W1/3, and A; 6, 8, and 33 SNPs were detected for W1/2, W2/3, and the L/W ratio, respectively; and no significant SNPs were detected for L (S8 and S9 Tables; S2 Fig). The SNP at position 10669990 on chromosome 10 was a common SNP significantly associated with the four different leaf widths, and the significant SNPs for the width at two-thirds length shared all but one significant SNP with the width at half length. In addition, the most significant SNP sites or regions for the ratio of leaf length to leaf width were consistent with those for the multiple traits of leaf length and four different leaf widths, except for the two significant SNPs on chromosomes 2 and 13.

Fig 5. Manhattan and quantile-quantile (QQ) plots of the association analyses for each univariate trait, L (A), W (B), W1/3 (C), W1/2 (D), W2/3 (E), and A (F), and the ratio of L to W (G) across the 19 chromosomes of the reference genome of P. trichocarpa. The left panel presents the Manhattan plots under genomic control, while the right panel shows the corresponding QQ plots before (blue) and after (green) genomic control. The horizontal dashed line indicates the genome-wide significance threshold of 5.33, which is the negative base 10 logarithm of the p-value threshold based on the Bonferroni correction at the 0.05 significance level.

Candidate genes affecting leaf shape

The candidate genes of the significant SNPs for the multiple traits of the 11-dimensional regular polar radii data were annotated with the nonredundant protein database at the NCBI and the GO database (S10 Table). One significant SNP region on chromosome 1 and five on chromosome 14 were found to harbor a total of 40 candidate genes functionally related to leaf shape (Table 3). However, the remaining 9 significant SNPs on chromosomes 1, 4, 6, 10, 15, 16, and 18 had no candidate genes with descriptions directly related to leaf shape, possibly because none of them formed an LD block with other SNPs and each therefore had at most two candidate genes. We found 8 candidate genes in 5 significant SNP regions that directly affect leaf growth and development, with descriptions such as “leaf development” and “regulation of leaf morphogenesis”. There were also 6 candidate genes on chromosomes 1 and 14 related to the hormone auxin, which plays important roles in initial leaf formation, lamina margin elaboration, and leaf vasculature patterning [48–51]. Moreover, 12 candidate genes were found to belong to the MYB gene family, which was previously reported to be involved in leaf development in Arabidopsis [52] and maize [53]. Furthermore, 2 candidate genes on chromosomes 1 and 14 are related to TCP genes, which were found to be involved in leaf development and morphology in Arabidopsis [54, 55]. In addition, 14 candidate genes in 5 significant SNP regions distributed on chromosomes 1 and 14 were related to light responses or photosynthesis; these genes are involved in activities such as response to light intensity, light harvesting, and photosynthesis. These genes are therefore likely to play important roles in leaf development and pattern formation.

Discussion

Leaf size and shape are among the most important traits during the development and growth of Populus. Understanding the genetic mechanism of these traits is of great interest to many poplar breeders. In the present study, we detected dozens of SNPs significantly associated with the multiple traits of the 11-dimensional regular leaf polar radii in a randomized complete block test with clones from the F1 hybrids of P. deltoides and P. simonii. These multiple traits can be considered to represent the leaf shape because the regular polar radii on the right side largely reflect the two-dimensional pattern of the leaf. Compared with previous studies for identifying QTLs or SNPs associated with leaf shape in Populus (see Introduction), we were able to identify many more QTLs or significant SNP regions. One of the main reasons for this detection power may be the use of the RCBD in the current GWAS. This kind of test design provided replicates of clones not only at the block level but also at the plot level, allowing thousands of individuals to be used for the association analysis. From a statistical perspective, the repeated phenotype data for each genotype that originated from a single seed can control for the spatial effects in the field and reduce systematic errors, hence improving the accuracy and power of GWAS. In contrast, in previous GWAS or QTL mapping studies on poplar leaf traits, phenotype data were measured from single plants with different genotypes in natural populations or full-sib families, possibly limiting QTL detection power.

Another advantage of our association analysis strategy may come from incorporating the multiple traits of leaf polar radii into the mvLMM for GWAS. Although mvLMMs have become increasingly important in GWAS because of their power gain over univariate analysis, the computation of genetic parameter estimates is nontrivial [56]. We implemented the parameter and statistical calculations with the flexible R package EMMREML by adding or modifying some code. Consequently, the mvLMM helped identify many more significant SNPs associated with leaf traits without appreciable genomic inflation or deflation. In contrast, after genomic control, the univariate LMM did not detect any significant SNPs for any single trait, such as leaf length and width (Fig 5). Even if genomic inflation was permitted, fewer than 10 significant SNPs were detected for each of the single traits W, W1/3, W1/2, W2/3, and A, whereas no significant SNPs were detected for L (S8 Table). In that case, the number of significant SNPs increased to 33 for the ratio of leaf length to width (S9 Table) but was still smaller than the number detected based on the multiple traits of the 11-dimensional regular leaf polar radii dataset. The fact that more SNPs were detected for the ratio of leaf length to leaf width than for the other single traits may largely be due to the much higher heritability of this trait (Table 1). This phenomenon can also be found in a previous study [11], where the authors identified 2 QTLs for leaf length and 2 for width but 5 for the ratio of the two traits.

Although our association analysis of the multiple traits based on the mvLMM was able to identify many more significant SNPs, the PVE of each SNP was low, ranging from 0.18 to 0.32% (Table 3). An intuitive explanation for this result is that leaf shape is possibly controlled by many genes with small effects, conforming to the infinitesimal model [57]. This explanation would further suggest that our strategy for GWAS in the current study is powerful for detecting such small-effect genes. It may also be the main reason why previous studies had a lower power for locating QTLs for single leaf traits, with only a few detected, although the PVEs of the QTLs were apparently larger than those estimated in this study [11, 25, 28]. However, the PVEs of SNPs or QTLs cannot be compared directly because they are calculated based on not only different population structures but also different statistical models. Even in the same study using the same statistical model, the PVE does not necessarily increase or decrease consistently with the test statistic used to determine the significance of the hypothesis test. This is because the estimates of the environmental variance vary for different SNPs or QTLs, possibly leading to inconsistencies between the PVEs and the test statistics. This phenomenon can be commonly found in the literature. For example, in Drost et al. [11], the first QTL for lamina length had a PVE value of 6.31% with a LOD value of 3.14, while the PVE of the second QTL was 8.10% with a lower LOD value of 2.68. In addition to these factors, the most important consideration is how to calculate the PVE based on a statistical model. For most fixed linear models with uncorrelated phenotype data, the R² statistic is generally used to measure the PVE in QTL mapping studies or GWAS. However, for mixed linear models, such measurements are not well established [58]. Here, we calculated the R² statistic as Eq (5) based on the weighted residual sum of squares [59].
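
To make the idea of an R² based on a weighted residual sum of squares concrete, the sketch below implements the general goodness-of-fit measure for generalized least squares described by Buse [59]. Eq (5) is not reproduced in this section, so this is an assumed general form rather than the exact statistic used in this study; the function names are hypothetical, `X` is assumed to contain an intercept column of ones, and `V` is assumed to be the known (or estimated) covariance matrix of the phenotype vector.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def buse_r2(y, X, V):
    """R^2 = 1 - e' V^-1 e / (y - mu)' V^-1 (y - mu) for a GLS fit.

    e is the GLS residual vector and mu is the V^-1-weighted mean of y.
    """
    n, p = len(y), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    vinv_cols = [solve(V, c) for c in cols]            # V^-1 x_j
    xtvx = [[dot(cols[j], vinv_cols[k]) for k in range(p)] for j in range(p)]
    xtvy = [dot(vinv_cols[j], y) for j in range(p)]    # x_j' V^-1 y (V symmetric)
    beta = solve(xtvx, xtvy)                           # GLS estimate
    resid = [y[i] - dot(X[i], beta) for i in range(n)]
    wrss = dot(resid, solve(V, resid))                 # weighted residual SS
    ones = [1.0] * n
    vinv_ones = solve(V, ones)
    mu = dot(vinv_ones, y) / dot(vinv_ones, ones)      # weighted mean of y
    centered = [yi - mu for yi in y]
    wtss = dot(centered, solve(V, centered))           # weighted total SS
    return 1.0 - wrss / wtss
```

When `V` is the identity matrix this reduces to the ordinary R² of a fixed linear model, which is one way to check the sketch against the uncorrelated case mentioned above.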

It is worth emphasizing that the 11-dimensional multivariate data of the regular leaf polar radii can largely represent the poplar leaf shape and can be applied in association analyses with SNPs for such traits that are difficult to measure. Naturally, it was believed that the higher the dimensionality of the radius data between the leaf centroid and edge points is, the better the characteristics of leaf shape can be represented (Fig 2D). Fu et al. [27] first implemented such an idea by extracting 360 coordinates on leaf outlines from scanned images and performed a series of association analyses with the leaf shape [28, 60]. We also performed GWAS of leaf shape with different dimensions of the radius data (e.g., RD61, RD16), which were extracted by our own R package (https://github.com/tongchf/LeafShape) because Fu et al. did not provide public software for the task. However, our results showed that for the higher dimensional data (i.e., RD61 and RD16), genomic deflation existed with λGC≤0.820, while for the lower dimensional data (i.e., RD09 and RD06), genomic inflation existed with λGC≥1.120 (Fig 3). In contrast, the RD11 data presented a balanced result between genomic inflation and deflation, exhibiting the best performance regarding genomic control in the GWAS with different dimensional data of the regular leaf polar radii.
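
The following Python sketch illustrates how regular polar radii between a centroid and the edge of an outline might be computed. It is not the LeafShape implementation: it assumes the outline is a closed polygon given as ordered (x, y) points, uses the polygon area centroid, and samples rays at evenly spaced angles over a half-turn to mimic radii on one side of the leaf. The angle range, the function names, and the choice of the farthest intersection along each ray are all assumptions made for illustration.

```python
import math


def polygon_centroid(outline):
    """Area centroid of a closed polygon given as ordered (x, y) points."""
    a = cx = cy = 0.0
    n = len(outline)
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * a), cy / (3.0 * a)


def regular_polar_radii(outline, n_radii, start=-math.pi / 2, end=math.pi / 2):
    """Distances from the centroid to the outline along n_radii regular angles.

    Angles are evenly spaced from start to end (n_radii >= 2 is assumed).
    """
    cx, cy = polygon_centroid(outline)
    eps = 1e-9
    radii = []
    for k in range(n_radii):
        theta = start + k * (end - start) / (n_radii - 1)
        dx, dy = math.cos(theta), math.sin(theta)
        best = None
        for i in range(len(outline)):
            px, py = outline[i]
            qx, qy = outline[(i + 1) % len(outline)]
            ex, ey = qx - px, qy - py
            wx, wy = px - cx, py - cy
            denom = dx * ey - dy * ex
            if abs(denom) < 1e-12:        # ray parallel to this edge
                continue
            s = (wx * ey - wy * ex) / denom   # distance along the ray
            u = (wx * dy - wy * dx) / denom   # position along the edge
            if s >= 0.0 and -eps <= u <= 1.0 + eps:
                best = s if best is None else max(best, s)
        if best is None:
            raise ValueError("ray does not hit the outline")
        radii.append(best)
    return radii
```

For a square with corners at (±1, ±1), `regular_polar_radii(square, 5)` returns radii of 1 toward the edge midpoints and √2 toward the corners. Changing `n_radii` corresponds to changing the dimensionality of the radius data, for example 11 for a dataset such as RD11.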

Compared with previous studies for poplar leaf shape, we found that there were a few overlapping regions (<5 Mb) containing significant SNPs or QTLs. S11 Table lists those significant SNPs or QTLs associated with leaf shape in the current study and in four recent studies [17, 18, 26, 61], excluding those previous QTL studies in which no physical QTL position information was available [11, 25, 27, 28]. The results in the previous studies for single leaf traits such as leaf length and width were not considered because we thought that the leaf shape could not be described by a single leaf parameter. We found that 7 significant SNPs detected in our study were very close (<5 Mb) to one or more SNPs identified in previous studies, of which 5 were consistent with Xia et al. [61], 4 with Chhetri et al. [17], 1 with McKown et al. [26], and 1 with Chhetri et al. [18]. In contrast, 5 overlapping regions were found between the four previous studies. It is interesting to find that 3 regions on chromosomes 4, 6, and 8 were coincidentally detected for leaf shape in three studies. Although our GWAS findings share more consistent SNPs with the previous results, most SNPs identified in the current and previous studies did not share an overlapping region. This result may be due to many reasons, but one of the main reasons is that different methods were used to describe the complex trait of leaf shape in the GWAS or QTL studies. Drost et al. [11] described the leaf shape with the ratio of leaf length to width, while Chhetri et al. [17, 18] described it with the combination of leaf area (LA), leaf dry weight (LD), leaf length (LL) and leaf width (LW) or the combination of leaf aspect ratio (AR) and specific leaf area (SL). However, based on the method of Fu et al. [27, 28], we used high-dimensional regular polar radii data to describe the leaf shape.
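
The <5 Mb proximity criterion used for this comparison can be sketched as follows. The sketch assumes that each SNP or QTL is summarized by a chromosome number and a single physical position in base pairs; the function name and the positions in the usage note are hypothetical and are not taken from S11 Table.

```python
def close_pairs(ours, theirs, max_dist=5_000_000):
    """List pairs of loci on the same chromosome that lie closer than max_dist bp.

    Each locus is a (chromosome, position) tuple; each result is
    (our_locus, their_locus, distance_in_bp).
    """
    pairs = []
    for chrom, pos in ours:
        for chrom2, pos2 in theirs:
            if chrom == chrom2 and abs(pos - pos2) < max_dist:
                pairs.append(((chrom, pos), (chrom2, pos2), abs(pos - pos2)))
    return pairs
```

For instance, with hypothetical loci `ours = [(4, 1_000_000), (6, 20_000_000)]` and `theirs = [(4, 4_500_000), (6, 30_000_000)]`, only the pair on chromosome 4 is reported, at a distance of 3.5 Mb.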

Conclusion

The novel strategy for GWAS with direct integration of the traditional randomized complete block design and the multiple traits of regular leaf polar radii into the multivariate linear mixed model facilitated the identification of many more significant SNPs associated with leaf shape in Populus than previous studies have detected. Moreover, it was demonstrated that the multivariate linear mixed model was more powerful than the univariate linear mixed model in the association analyses for leaf traits such as leaf length, width, and area. Most flanking regions surrounding significant SNPs harbored potential candidate genes that were related to the growth and development of the poplar leaf. Our results enhance the understanding of the molecular mechanism underlying leaf morphological variation in Populus. In addition, the multivariate data from a moderate number of regular leaf polar radii could largely represent the leaf shape and exhibited better genomic control in the GWAS of poplar leaf shape.

Supporting information

S1 Fig

Histograms with probability density curves (red) of normal distributions for each univariate trait of L (A), W (B), W1/3 (C), W1/2 (D), W2/3 (E), A (F), and the ratio of L to W (G) in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

(DOCX)

S2 Fig

Manhattan plots of the association analyses without genomic control for each univariate trait of L (A), W (B), W1/3 (C), W1/2 (D), W2/3 (E), A (F), and the ratio of L to W (G) across the 19 chromosomes of the reference genome of P. trichocarpa. The horizontal dashed line indicates the genome-wide significance threshold of 5.33, the negative base-10 logarithm of the Bonferroni-corrected p-value cutoff at the 0.05 significance level.

(DOCX)

S1 Table. The RADseq data information for the 2 parents and 163 progeny in the F1 hybrid population of Populus deltoides and Populus simonii.

(DOCX)

S2 Table. The raw data of leaf length, different widths and area from a randomized complete block design in Populus.

(XLSX)

S3 Table. Relative difference between leaf measurements obtained with the two software packages ImageJ and LeafShape.

(XLSX)

S4 Table. Correlation coefficients among the leaf traits of L, W, W1/3, W1/2, W2/3, A, and the ratio of L to W in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

(DOCX)

S5 Table. Analysis of variance for the leaf parameters of L, W, W1/3, W1/2, W2/3, and Area in the randomized complete block experiment derived from the F1 progeny of Populus deltoides × Populus simonii.

(DOCX)

S6 Table. Canonical correlation coefficients among the leaf length, widths, area, the length/width ratio, and polar radii in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

(DOCX)

S7 Table. Correlation coefficients between the first principal component of different radius datasets and the leaf length, different widths, or area in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

(DOCX)

S8 Table. Summary of significant SNPs associated with each trait of the four different leaf widths and area without genomic control.

(DOCX)

S9 Table. Summary of significant SNPs associated with the ratio of the leaf length to the maximum width without genomic control.

(DOCX)

S10 Table. The annotation of candidate genes for the significant SNPs with the non-redundant protein database at NCBI and GO database.

(XLSX)

S11 Table. Consistency between significant SNPs for poplar leaf shape identified in the current and previous studies.

The distance of two close SNPs between two different studies is presented in brackets.

(XLSX)

Acknowledgments

We thank Professor Huogen Li in Nanjing Forestry University for his great help in establishing the randomized complete block design.

Data Availability

The RADseq data is available in the SRA database at NCBI (http://www.ncbi.nlm.nih.gov/Traces/sra) with the accession numbers listed in S1 Table. Other relevant data are within the paper and its Supporting Information files.

Funding Statement

Funding for this research was provided by the National Natural Science Foundation of China (No. 31870654 and 31270706) and the Priority Academic Program Development of the Jiangsu Higher Education Institutions (PAPD). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Field CB, Behrenfeld MJ, Randerson JT, Falkowski P. Primary production of the biosphere: integrating terrestrial and oceanic components. Science. 1998;281(5374):237–40. doi: 10.1126/science.281.5374.237.
  • 2. Fleming AJ. The control of leaf development. New Phytol. 2005;166(1):9–20. doi: 10.1111/j.1469-8137.2004.01292.x.
  • 3. Xu F, Guo W, Xu W, Wei Y, Wang R. Leaf morphology correlates with water and light availability: What consequences for simple and compound leaves? Progress in Natural Science. 2009;19(12):1789–98. doi: 10.1016/j.pnsc.2009.10.001; WOS:000271902400017.
  • 4. Liao F, Peng J, Chen R. LeafletAnalyzer, an automated software for quantifying, comparing and classifying blade and serration features of compound leaves during development, and among induced mutants and natural variants in the legume Medicago truncatula. Front Plant Sci. 2017;8. doi: 10.3389/fpls.2017.00915; WOS:000402386300001.
  • 5. Krieger JD, Guralnick RP, Smith DM. Generating empirically determined, continuous measures of leaf shape for paleoclimate reconstruction. PALAIOS. 2007;22(2):212–9. doi: 10.2110/palo.2005.p05-079r; WOS:000248060700010.
  • 6. Peppe DJ, Royer DL, Cariglino B, Oliver SY, Newman S, Leight E, et al. Sensitivity of leaf size and shape to climate: global patterns and paleoclimatic applications. New Phytologist. 2011;190(3):724–39. doi: 10.1111/j.1469-8137.2010.03615.x.
  • 7. Byrne ME. Networks in leaf development. Curr Opin Plant Biol. 2005;8(1):59–66. doi: 10.1016/j.pbi.2004.11.009.
  • 8. Barkoulas M, Hay A, Kougioumoutzi E, Tsiantis M. A developmental framework for dissected leaf formation in the Arabidopsis relative Cardamine hirsuta. Nat Genet. 2008;40(9):1136–41. doi: 10.1038/ng.189.
  • 9. Tsuge T, Tsukaya H, Uchimiya H. Two independent and polarized processes of cell elongation regulate leaf blade expansion in Arabidopsis thaliana (L.) Heynh. Development. 1996;122(5):1589–600.
  • 10. Narita NN, Moore S, Horiguchi G, Kubo M, Demura T, Fukuda H, et al. Overexpression of a novel small peptide ROTUNDIFOLIA4 decreases cell proliferation and alters leaf shape in Arabidopsis thaliana. Plant J. 2004;38(4):699–713. doi: 10.1111/j.1365-313X.2004.02078.x.
  • 11. Drost DR, Puranik S, Novaes E, Novaes CR, Dervinis C, Gailing O, et al. Genetical genomics of Populus leaf shape variation. BMC Plant Biol. 2015;15:166. doi: 10.1186/s12870-015-0557-7; PubMed Central PMCID: PMC4486686.
  • 12. Holtan HEE, Hake S. Quantitative trait locus analysis of leaf dissection in tomato using Lycopersicon pennellii segmental introgression lines. Genetics. 2003;165(3):1541–50. doi: 10.1093/genetics/165.3.1541; WOS:000187459100048.
  • 13. Juenger T, Perez-Perez JM, Bernal S, Micol JL. Quantitative trait loci mapping of floral and leaf morphology traits in Arabidopsis thaliana: evidence for modular genetic architecture. Evolution & Development. 2005;7(3):259–71. doi: 10.1111/j.1525-142X.2005.05028.x; WOS:000228690100010.
  • 14. Li F, Kitashiba H, Inaba K, Nishio T. A Brassica rapa linkage map of EST-based SNP markers for identification of candidate genes controlling flowering time and leaf morphological traits. DNA Research. 2009;16(6):311–23. doi: 10.1093/dnares/dsp020; WOS:000272832700001.
  • 15. Fu Y, Xu G, Chen H, Wang X, Chen Q, Huang C, et al. QTL mapping for leaf morphology traits in a large maize-teosinte population. Molecular Breeding. 2019;39(7). doi: 10.1007/s11032-019-1012-5; WOS:000473170500002.
  • 16. Du B, Liu L, Wang Q, Sun G, Ren X, Li C, et al. Identification of QTL underlying the leaf length and area of different leaves in barley. Sci Rep. 2019;9(1):4431. doi: 10.1038/s41598-019-40703-6; PubMed Central PMCID: PMC6418291.
  • 17. Chhetri HB, Macaya-Sanz D, Kainer D, Biswal AK, Evans LM, Chen JG, et al. Multitrait genome-wide association analysis of Populus trichocarpa identifies key polymorphisms controlling morphological and physiological traits. New Phytologist. 2019;223(1):293–309. doi: 10.1111/nph.15777; WOS:308432139100027.
  • 18. Chhetri HB, Furches A, Macaya-Sanz D, Walker AR, Kainer D, Jones P, et al. Genome-Wide Association Study of Wood Anatomical and Morphological Traits in Populus trichocarpa. Front Plant Sci. 2020;11. doi: 10.3389/fpls.2020.545748; WOS:000574412000001.
  • 19. Bylesjo M, Segura V, Soolanayakanahally RY, Rae AM, Trygg J, Gustafsson P, et al. LAMINA: a tool for rapid quantification of leaf size and shape parameters. BMC Plant Biol. 2008;8:82. doi: 10.1186/1471-2229-8-82; PubMed Central PMCID: PMC2500018.
  • 20. Geraldes A, Pang J, Thiessen N, Cezard T, Moore R, Zhao Y, et al. SNP discovery in black cottonwood (Populus trichocarpa) by population transcriptome resequencing. Mol Ecol Resour. 2011;11(Suppl 1):81–92. doi: 10.1111/j.1755-0998.2010.02960.x.
  • 21. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006;313:1596–604. doi: 10.1126/science.1128691.
  • 22. Eckenwalder JE. Systematics and evolution of Populus. In: Stettler RF, Bradshaw HD, Heilman PE, Hinckley TM, editors. Biology of Populus and its implications for management and conservation. Ottawa: NRC Research Press, National Council of Canada; 1996. p. 7–32.
  • 23. Stettler R, Bradshaw HD, Heilman PE, Hinckley TM. Biology of Populus and its implications for management and conservation. Ottawa: NRC Research Press; 1996.
  • 24. Marron N, Ceulemans R. Genetic variation of leaf traits related to productivity in a Populus deltoides x Populus nigra family. Canadian Journal of Forest Research. 2006;36(2):390–400. doi: 10.1139/x05-245; WOS:000237199900012.
  • 25. Wu R, Bradshaw H, Stettler R. Molecular genetics of growth and development in Populus (Salicaceae). V. Mapping quantitative trait loci affecting leaf variation. American Journal of Botany. 1997;84(2):143–53. doi: 10.2307/2446076.
  • 26. McKown AD, Klapste J, Guy RD, Geraldes A, Porth I, Hannemann J, et al. Genome-wide association implicates numerous genes underlying ecological trait variation in natural populations of Populus trichocarpa. New Phytologist. 2014;203(2):535–53. doi: 10.1111/nph.12815; WOS:000337639800019.
  • 27. Fu G, Bo W, Pang X, Wang Z, Chen L, Song Y, et al. Mapping shape quantitative trait loci using a radius-centroid-contour model. Heredity. 2013;110(6):511–9. doi: 10.1038/hdy.2012.97; WOS:000319112000002.
  • 28. Fu GF, Huang M, Bo WH, Hao H, Wu RL. Mapping morphological shape as a high-dimensional functional curve. Briefings in Bioinformatics. 2018;19(3):461–71. doi: 10.1093/bib/bbw111; WOS:000432676200009.
  • 29. Tong CF, Li HG, Wang Y, Li XR, Ou JJ, Wang DY, et al. Construction of high-density linkage maps of Populus deltoides × P. simonii using restriction-site associated DNA sequencing. PLoS One. 2016;11(3):e0150692. doi: 10.1371/journal.pone.0150692.
  • 30. Mousavi M, Tong C, Liu F, Tao S, Wu J, Li H, et al. De novo SNP discovery and genetic linkage mapping in poplar using restriction site associated DNA and whole-genome sequencing technologies. BMC Genomics. 2016;17:656. doi: 10.1186/s12864-016-3003-9; PubMed Central PMCID: PMC4991039.
  • 31. Patel RK, Jain M. NGS QC Toolkit: A toolkit for quality control of next generation sequencing data. PLoS One. 2012;7(2):e30619. doi: 10.1371/journal.pone.0030619.
  • 32. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60. doi: 10.1093/bioinformatics/btp324.
  • 33. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9. doi: 10.1093/bioinformatics/btp352.
  • 34. Maliepaard C, Jansen J, Van Ooijen JW. Linkage analysis in a full-sib family of an outbreeding plant species: overview and consequences for applications. Genet Res. 1997;70:237–50. doi: 10.1017/s0016672397003005.
  • 35. Tong CF, Zhang B, Shi JS. A hidden Markov model approach to multilocus linkage analysis in a full-sib family. Tree Genet Genomes. 2010;6(5):651–62. doi: 10.1007/s11295-010-0281-2.
  • 36. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81(3):559–75. doi: 10.1086/519795; WOS:000249128200012.
  • 37. Searle SR, Casella G, McCulloch CE. Variance Components. New Jersey, USA: John Wiley & Sons, Inc.; 2006. 458 p.
  • 38. Lynch M, Walsh B. Genetics and Analysis of Quantitative Traits. Sunderland, MA, USA: Sinauer Associates, Inc.; 1998. 143–5 p.
  • 39. Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, et al. Efficient control of population structure in model organism association mapping. Genetics. 2008;178(3):1709–23. doi: 10.1534/genetics.107.080101; PubMed Central PMCID: PMC2278096.
  • 40. Devlin B, Roeder K. Genomic control for association studies. Biometrics. 1999;55(4):997–1004. doi: 10.1111/j.0006-341x.1999.00997.x.
  • 41. van Iterson M, van Zwet EW, Heijmans BT, Consortium B. Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution. Genome Biology. 2017;18. doi: 10.1186/s13059-017-1156-8; WOS:000394827100001.
  • 42. Xu R. Measuring explained variation in linear mixed effects models. Stat Med. 2003;22:3527–41. doi: 10.1002/sim.1572.
  • 43. Chen Y, Wu H, Yang W, Zhao W, Tong C. Multivariate linear mixed model enhanced the power of identifying genome-wide association to poplar tree heights in a randomized complete block design. G3. 2021;11(2). Epub 2021/02/20. doi: 10.1093/g3journal/jkaa053; PubMed Central PMCID: PMC8022933.
  • 44. Slaten ML, Yobi A, Bagaza C, Chan YO, Shrestha V, Holden S, et al. mGWAS Uncovers Gln-Glucosinolate Seed-Specific Interaction and its Role in Metabolic Homeostasis. Plant Physiol. 2020;183(2):483–500. Epub 2020/04/23. doi: 10.1104/pp.20.00039; PubMed Central PMCID: PMC7271782.
  • 45. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990;215(3):403–10. doi: 10.1016/S0022-2836(05)80360-2.
  • 46. Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007;35:D61–5. doi: 10.1093/nar/gkl842; PubMed Central PMCID: PMC1716718.
  • 47. Schindelin J, Rueden CT, Hiner MC, Eliceiri KW. The ImageJ ecosystem: An open platform for biomedical image analysis. Molecular Reproduction And Development. 2015;82(7–8):518–29. doi: 10.1002/mrd.22489; WOS:000358486700004.
  • 48. Reinhardt D, Pesce ER, Stieger P, Mandel T, Baltensperger K, Bennett M, et al. Regulation of phyllotaxis by polar auxin transport. Nature. 2003;426(6964):255–60. doi: 10.1038/nature02081; WOS:000186660800035.
  • 49. Scanlon MJ. The polar auxin transport inhibitor N-1-naphthylphthalamic acid disrupts leaf initiation, KNOX protein regulation, and formation of leaf margins in maize. Plant Physiology. 2003;133(2):597–605. doi: 10.1104/pp.103.026880; WOS:000185974800022.
  • 50. Hay A, Tsiantis M. The genetic basis for differences in leaf form between Arabidopsis thaliana and its wild relative Cardamine hirsuta. Nature Genetics. 2006;38(8):942–7. doi: 10.1038/ng1835; WOS:000239325700026.
  • 51. Scarpella E, Marcos D, Friml J, Berleth T. Control of leaf vascular patterning by polar auxin transport. Genes & Development. 2006;20(8):1015–27. doi: 10.1101/gad.1402406; WOS:000236951500011.
  • 52. Sun Y, Zhou Q, Zhang W, Fu Y, Huang H. ASYMMETRIC LEAVES1, an Arabidopsis gene that is involved in the control of cell differentiation in leaves. Planta. 2002;214:694–702. doi: 10.1007/s004250100673.
  • 53. Theodoris G, Inada N, Freeling M. Conservation and molecular dissection of ROUGH SHEATH2 and ASYMMETRIC LEAVES1 function in leaf development. Proc Natl Acad Sci USA. 2003;100:6837–42. doi: 10.1073/pnas.1132113100.
  • 54. Kieffer M, Master V, Waites R, Davies B. TCP14 and TCP15 affect internode length and leaf shape in Arabidopsis. Plant J. 2011;68(1):147–58. doi: 10.1111/j.1365-313X.2011.04674.x; PubMed Central PMCID: PMC3229714.
  • 55. Aguilar-Martinez JA, Sinha N. Analysis of the role of Arabidopsis class I TCP genes AtTCP7, AtTCP8, AtTCP22, and AtTCP23 in leaf development. Front Plant Sci. 2013;4:406. doi: 10.3389/fpls.2013.00406; WOS:241371718200001.
  • 56. Zhou X, Stephens M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat Methods. 2014;11(4):407–9. doi: 10.1038/nmeth.2848; PubMed Central PMCID: PMC4211878.
  • 57. Hu Z, Wang Z, Xu S. An infinitesimal model for quantitative trait genomic value prediction. PLoS One. 2012;7(7):e41336. doi: 10.1371/journal.pone.0041336; PubMed Central PMCID: PMC3399838.
  • 58. Sun G, Zhu C, Kramer MH, Yang SS, Song W, Piepho HP, et al. Variation explained in mixed-model association mapping. Heredity. 2010;105(4):333–40. doi: 10.1038/hdy.2010.11.
  • 59. Buse A. Goodness of fit in generalized least-squares estimation. The American Statistician. 1973;27:106–8. doi: 10.1080/00031305.1973.10479003.
  • 60. Fu GF, Saunders G, Stevens J. Holm multiple correction for large-scale gene-shape association mapping. BMC Genetics. 2014;15(Suppl 1):S5. doi: 10.1186/1471-2156-15-S1-S5; WOS:000345651700007.
  • 61. Xia WX, Xiao ZA, Cao P, Zhang Y, Du KB, Wang N. Construction of a high-density genetic map and its application for leaf shape QTL mapping in poplar. Planta. 2018;248(5):1173–85. doi: 10.1007/s00425-018-2958-y; WOS:000447030900009.

Decision Letter 0

Karthikeyan Adhimoolam

13 Aug 2021

PONE-D-21-14578

Multivariate Genome-Wide Association Study of Leaf Shape in a Populus deltoides and P. simonii F1 Pedigree

PLOS ONE

Dear Dr. Tong,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 27 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Karthikeyan Adhimoolam

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Overall the manuscript is clear and well written and the results support the conclusions. It presents an interesting analysis of leaf shape.

I have several specific points that need clarification.

1. The abstract is clear and the results are well presented, but it doesn't say why poplar or why leaf shape. A sentence explaining this in the abstract would make it clearer, because at present the reason for using poplar doesn't really stand out.

2. Line 54 in the introduction- polygenic is a standard term and doesn't really need defining here

3. The QTL analysis needs more detail in the methods section.

4. Line 91 needs another look at the grammar

5. I know this is mentioned in the introduction but in the methods you could make it clear which is the male and female parent in your cross.

6. Line 153 a rather than an SNP

7. Line 186 it isn't clear where the 2,244 samples came from. The methods state 3 blocks and 6 leaves per block

8. Line 239 The sentence needs rewriting. Should it be "with few SNPs on chromosomes"?

Reviewer #2: This manuscript describes QTL mapping in poplar. A segregating mapping population was used as plant material, while a GWAS model was employed for data analysis. The result is interesting; however, I do have some concerns. The authors should consider the points below.

1. QTL mapping was conducted with a segregating population. Did the authors consider the population structure (Q) and kinship (K)? How did the mvLMM work? I noticed the model Y = XB + ZG + E; it seems there are parameters similar to Q and K in this model. If so, how did the authors calculate them?

2. The authors investigated 100 genes for each significant SNP. Why 100? Usually, a physical distance would be used as a threshold.

3. The QTL results do not seem good. The authors compared their results with some previous reports, but I found they missed a very similar study. In that study, P. deltoides and P. simonii were also used as parents for an F1 population. See https://doi.org/10.1007/s00425-018-2958-y. Are there any overlapping regions between the two studies?

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Oct 28;16(10):e0259278. doi: 10.1371/journal.pone.0259278.r002

Author response to Decision Letter 0


19 Aug 2021

Response to Reviewers’ Comments:

Reviewer #1: Overall the manuscript is clear and well written and the results support the conclusions. It presents an interesting analysis of leaf shape.

I have several specific points that need clarification.

1. The abstract is clear and the results are well presented, but it doesn't say why poplar or why leaf shape. A sentence explaining this in the abstract would make it clearer, because at present the reason for using poplar doesn't really stand out.

RE: Thank you very much for your suggestion. We added a sentence to the Abstract to address the importance of poplar and its leaf shape, as follows: “Most poplar species are of great economic and ecological values and their leaf morphology can be a predictor for wood productivity and environment adaptation”.

2. Line 54 in the introduction- polygenic is a standard term and doesn't really need defining here

RE: Thanks. We deleted the words in the bracket as you suggested.

3. The QTL analysis needs more detail in the methods section.

RE: Thank you for this suggestion. After Line 140, we described the mvLMM in detail for a single observation as follows:

“$$y_{ijkl} = \mu_l + b_{il} + g_{jl} + u_{jl} + e_{ijkl} \qquad (1)$$

where $y_{ijkl}$ is the lth polar radius of the kth tree leaf of the jth clone in the ith block; $\mu_l$ is the overall mean of the lth polar radius; $b_{il}$ is the effect of the ith block; $g_{jl}$ is the genotype effect of the jth clone at any tested SNP; $u_{jl}$ is the polygenic background effect of the jth clone; and $e_{ijkl}$ is the residual effect. It is assumed that $b_{il}$ and $g_{jl}$ are fixed effects, while $u_{jl}$ and $e_{ijkl}$ are random effects. In matrix form, model (1) can be written as …”.
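
To make the structure of this single-observation model concrete, the following minimal sketch simulates one polar radius as the sum of an overall mean, a block effect, a SNP genotype effect, a polygenic background effect, and a residual, under a balanced randomized complete block layout. It is an illustration only: the function name `simulate_radius`, all numeric values, the 0/1 genotype coding, and the way correlated polygenic effects are drawn are assumptions made here and are not taken from the manuscript. Only one trait is simulated rather than the full multivariate response.

```python
import random


def simulate_radius(n_blocks=3, n_clones=4, n_leaves=6, seed=0):
    """Simulate y = mean + block + genotype + polygenic + residual for one trait."""
    rng = random.Random(seed)
    mu = 5.0       # overall mean (hypothetical value)
    beta = 0.5     # effect of the tested SNP (hypothetical value)
    sigma_u = 0.3  # polygenic standard deviation (hypothetical value)

    # Block effects are fixed in the model; they are drawn once here only
    # to obtain some illustrative values.
    block = [rng.gauss(0, 0.2) for _ in range(n_blocks)]
    # Genotype code of each clone at the tested SNP (assumed 0/1 coding).
    snp = [rng.choice([0, 1]) for _ in range(n_clones)]
    # Full-sib polygenic effects: a shared family part plus a clone-specific
    # part gives variance sigma_u^2 and covariance 0.5 * sigma_u^2 between
    # clones, matching a relationship matrix with 0.5 off the diagonal.
    family = rng.gauss(0, 1)
    poly = [sigma_u * (0.5 ** 0.5) * (family + rng.gauss(0, 1))
            for _ in range(n_clones)]

    rows = []
    for i in range(n_blocks):
        for j in range(n_clones):
            for k in range(n_leaves):
                residual = rng.gauss(0, 0.1)
                y = mu + block[i] + beta * snp[j] + poly[j] + residual
                rows.append((i, j, k, y))
    return rows
```

In a complete design the number of rows equals blocks × clones × leaves; missing plots lower that total.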

4. Line 91 needs another look at the grammar:

RE: Thanks. We modified the sentence as “we identified many more SNPs significantly associated with leaf shape than those detected in previous studies”.

5. I know this is mentioned in the introduction but in the methods you could make it clear which is the male and female parent in your cross.

RE: Thank you very much for your suggestion. In Line 103, we changed “P. deltoides and P. simonii” to “the female P. deltoides and the male P. simonii”.

6. Line 153 a rather than an SNP

RE: Thanks. We modified “an SNP” as “any single SNP”.

7. Line 186 it isn't clear where the 2,244 samples came from. The methods state 3 blocks and 6 leaves per block

RE: Thanks for your comments. We inserted a sentence there to explain that there were a few missed samples in some plots as follows: “Some plots had missed samples due to the damage from pest, disease, poor rooting ability, or other unknown reasons”.

8. Line 239 The sentence needs rewriting. Should it be "with few SNPs on chromosomes"?

RE: Thanks for pointing out this grammar error. We modified it as “with a few SNPs on chromosomes …”.

Reviewer #2: This manuscript describes QTL mapping in poplar. A segregating mapping population was used as plant material, while a GWAS model was employed for data analysis. The result is interesting; however, I do have some concerns. The authors should consider the points below.

1. QTL mapping was conducted with a segregating population. Did the authors consider the population structure (Q) and kinship (K)? How did the mvLMM work? I noticed the model Y = XB + ZG + E; it seems there are parameters similar to Q and K in this model. If so, how did the authors calculate them?

RE: Thank you very much for your comments.

1) Because our samples were from a single full-sib family, the population is so simple that it would be improper to apply the population structure (Q) model to the current GWAS. In fact, the mvLMM incorporates the kinship matrix (K) through the relationship A = 2K, where A is the additive relationship matrix and K was derived from genetic theory [38]. The relationship between A and K is described in Lines 150-152.

2) Y = XB + ZG + E is the mvLMM in matrix form, which is rather complicated. However, it is theoretically well established because it is a linear model, and such models are widely applied. The F statistic for testing a single SNP is presented as Equation (3) and can be calculated with the R package EMMREML (https://cran.r-project.org/web/packages/EMMREML). Please see Lines 171-174 for details.
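
The relationship A = 2K in point 1) can be made concrete with a short sketch. It assumes the standard pedigree expectations for non-inbred full sibs, namely a kinship coefficient of 0.25 between two different clones and a self-kinship of 0.5; the function names `full_sib_kinship` and `additive_relationship` are hypothetical and are not part of EMMREML or of the authors' code.

```python
def full_sib_kinship(n_clones):
    """Expected kinship matrix K for n non-inbred clones of one full-sib family."""
    return [[0.5 if i == j else 0.25 for j in range(n_clones)]
            for i in range(n_clones)]


def additive_relationship(kinship):
    """Additive relationship matrix A = 2K, computed element by element."""
    return [[2 * value for value in row] for row in kinship]


K = full_sib_kinship(3)
A = additive_relationship(K)
for row in A:
    print(row)
```

Because every entry follows from the known pedigree, no marker data are needed to fill this matrix, which is the point the response makes about Q and K.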

2. The authors investigated 100 genes for each significant SNP. Why 100? Usually, a physical distance would be used as a threshold.

RE: Thanks. We took the so-called proximate strategy to investigate the genes near each significant SNP (Monclus et al. 2012; Geng et al. 2015; Su et al. 2017; Vanous et al. 2018). Genes with annotations related to leaf growth and development were chosen as candidate genes and are listed in Table 3. However, the literature offers no consensus on how many nearby genes should be examined as possible candidates. We chose 100 nearby genes in order to find as many candidate genes as possible for each significant SNP. To clarify this point, we cited these four papers in Line 177.

Similarly, a physical distance threshold could be used to investigate the nearby genes, but the number of candidate genes would then vary among QTLs.
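
The difference between the two selection rules discussed here, a fixed number of nearest genes versus a fixed physical distance, can be sketched as follows. The gene names, coordinates, and helper functions (`gene_distance`, `nearest_genes`, `genes_within`) are hypothetical and serve only to illustrate the contrast on a single chromosome; they do not reproduce the authors' pipeline.

```python
def gene_distance(snp_pos, gene):
    """Distance in bp from a SNP to a gene; 0 if the SNP lies inside the gene."""
    _, start, end = gene
    if start <= snp_pos <= end:
        return 0
    return min(abs(snp_pos - start), abs(snp_pos - end))


def nearest_genes(snp_pos, genes, k):
    """Count-based rule: the k genes closest to the SNP."""
    return sorted(genes, key=lambda g: gene_distance(snp_pos, g))[:k]


def genes_within(snp_pos, genes, window):
    """Distance-based rule: every gene within `window` bp of the SNP."""
    return [g for g in genes if gene_distance(snp_pos, g) <= window]


# Hypothetical (name, start, end) annotations on one chromosome.
genes = [("g1", 1_000, 2_000), ("g2", 3_000, 4_500),
         ("g3", 50_000, 52_000), ("g4", 400_000, 403_000)]
snp = 3_500

print([g[0] for g in nearest_genes(snp, genes, 3)])
print([g[0] for g in genes_within(snp, genes, 10_000)])
```

The count-based rule always returns the same number of genes but may reach far from the SNP in gene-poor regions, whereas the distance-based rule keeps the region fixed and lets the number of genes vary.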

Monclus et al. 2012. Integrating genome annotation and QTL position to identify candidate genes for productivity, architecture and water-use efficiency in Populus spp. BMC Plant Biology 12:173 DOI 10.1186/1471-2229-12-173.

Geng et al. 2015. A genome-wide association study in catfish reveals the presence of functional hubs of related genes within QTLs for columnaris disease resistance. BMC Genomics 16:196 DOI 10.1186/s12864-015-1409-4.

Su et al. 2017. High density linkage map construction and mapping of yield trait QTLs in maize (Zea mays) using the genotyping-by-sequencing (GBS) technology. Frontiers In Plant Science 8 DOI 10.3389/fpls.2017.00706.

Vanous et al. 2018. Association mapping of flowering and height traits in germplasm enhancement of maize doubled haploid (GEM-DH) lines. Plant Genome 11(2):1-14 DOI 10.3835/plantgenome2017.09.0083.

3. The QTL results do not seem good. The authors compared their results with some previous reports, but I found they missed a very similar study. In that study, P. deltoides and P. simonii were also used as parents for an F1 population. See https://doi.org/10.1007/s00425-018-2958-y. Are there any overlapping regions between the two studies?

RE: Thanks for your comments. Yes, this paper reported a QTL study of leaf shape, but it was based on linkage maps, and no physical position information was available for the QTLs detected. In Lines 338-339, we wrote “S11 Table lists those significant SNPs associated with leaf shape in the current study and in three recent studies [17, 18, 26], excluding the previous QTL studies because no position information was available on the physical maps for those QTLs related to poplar leaf shape [11, 25, 27, 28]”. Nevertheless, we cited this paper there as reference 63.

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Karthikeyan Adhimoolam

7 Sep 2021

PONE-D-21-14578R1

Multivariate Genome-Wide Association Study of Leaf Shape in a Populus deltoides and P. simonii F1 Pedigree

PLOS ONE

Dear Dr. Chunfa Tong,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

ACADEMIC EDITOR: Reviewers raised concerns and were not satisfied with the response from the authors. Therefore, I suggest that the authors write a detailed response. Recommended for major revision.

==============================

Please submit your revised manuscript within 21 days. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript.

Kind regards,

Karthikeyan Adhimoolam

Academic Editor

PLOS ONE

Journal Requirements:

Additional Editor Comments (if provided):

Reviewers raised concerns and were not satisfied with the response from the authors. Thus, I suggest that the authors write a detailed response. Recommended for major revision.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

6. Review Comments to the Author

Reviewer #1: All of the comments and suggestions that I made have now been addressed and clearly answered. The manuscript is now much improved after all comments and suggestions have been taken on board.

Reviewer #2: In the last round of review, I asked three questions, but the authors did not answer them directly. The first question is about the calculation of parameters in the association model. Yes, the equation was provided in the M&M; however, how did you match your data to each parameter? For example, in the GWAS model, it is not right to use all SNPs for the Q calculation, but all SNPs should be used for the K calculation. So, how did you calculate A and K?

The second question concerns the determination of candidate genes. Frankly, I don't think taking 100 genes around each significant SNP was the right choice. Gene density differs among regions of the genome. In linkage-based QTL mapping, the 95% confidence interval is used, while LD length is used in GWAS. So, I think there should also be a strategy for defining the candidate QTL region in the model you used. Yes, different strategies, including 100 genes, have been used in some published papers, but this doesn't mean all of them were reasonable.

The third question concerns the comparison between your results and a previous, very similar study. The authors didn't perform the comparison because they could not find physical positions of SNPs in that paper. Is that true? I am quite sure there was a physical position for each SNP. The authors of that study also provided the physical position of each candidate region.

I found the quality of the study to be far above the quality of the writing in this paper. Frankly, your work could be published in other highly reputable journals. The authors should carefully revise their manuscript.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS One. 2021 Oct 28;16(10):e0259278. doi: 10.1371/journal.pone.0259278.r004

Author response to Decision Letter 1


5 Oct 2021

Response to Reviewers’ Comments:

Reviewer #1: All of the comments and suggestions that I made have now been addressed and clearly answered. The manuscript is now much improved after all comments and suggestions have been taken on board.

RE: Thank you very much for your positive comments.

Reviewer #2: In the last round of review, I asked three questions, but the authors did not answer them directly. The first question is about the calculation of parameters in the association model. Yes, the equation was provided in the M&M; however, how did you match your data to each parameter? For example, in the GWAS model, it is not right to use all SNPs for the Q calculation, but all SNPs should be used for the K calculation. So, how did you calculate A and K?

RE: Thank you very much for pointing out this issue again.

1) In this study, the clones were from a full-sib family. In theory, for any two clones, the coefficient of kinship (an off-diagonal element of the K matrix) is expected to be 0.25 (Loiselle et al. 1995; Lynch and Walsh 1998). This leads to a relationship matrix (A = 2K) with ones on the diagonal and 0.5 elsewhere (Lynch and Walsh 1998; Bae et al. 2016). Here, it is unnecessary to calculate (in fact, to estimate) the matrix K from SNP data because the pedigree is fully known and each value in the K matrix is equal to 0.25 or 0.5. We think we have described this issue clearly and cited the related literature in Lines 152-159.

Certainly, in most GWAS, where natural populations are used and the pedigrees are usually unknown, the kinship matrix needs to be estimated from SNP data. However, from a statistical point of view, theoretical values are usually better than estimates because the accuracy of an estimate depends not only on the completeness of the sampling data but also on the estimation method.

2) Population structure (Q) was originally used in human GWAS to overcome genomic inflation because the population was considered natural and more complicated. However, the Q method is not a “golden rule” for all GWAS, and it sometimes fails to control genomic inflation (Kim 2019; https://doi.org/10.1101/647768). In the current study, we established the multivariate linear mixed model for GWAS of leaf shape according to the traditional randomized complete block design. By adjusting the dimension of the regular polar radii data, we found that data of moderate dimension can successfully control genomic inflation and yield a better result in finding significant SNPs associated with leaf shape. We think that our mvLMM for GWAS was successfully established and every parameter was described clearly. Moreover, all parameter estimation and significance testing were implemented with the R package EMMREML (https://cran.r-project.org/web/packages/EMMREML).
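
Genomic inflation, which point 2) refers to, is commonly summarized by the genomic inflation factor λ: the median of the observed test statistics divided by the median expected under the null hypothesis. The sketch below is one conventional way to compute it, assuming each p-value is converted to a chi-square statistic with one degree of freedom; the function name `genomic_inflation` is hypothetical, and the exact procedure used by the authors is not stated in this response.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from p-values in (0, 1], via 1-df chi-square statistics."""
    nd = NormalDist()
    # A 1-df chi-square quantile is the square of a standard normal quantile,
    # so the statistic with upper-tail probability p is inv_cdf(p / 2) ** 2.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.25) ** 2  # median of chi-square(1), about 0.455
    return median(chi2) / expected_median


# Evenly spaced p-values mimic the null hypothesis, so lambda is close to 1.
null_p = [(i - 0.5) / 1001 for i in range(1, 1002)]
print(genomic_inflation(null_p))
```

Values of λ well above 1 indicate that the test statistics are systematically larger than expected, which is the inflation that genomic control is meant to correct.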

Bae, H., S. Monti, M. Montano, M.H. Steinberg, T.T. Perls et al., 2016 Learning Bayesian networks from correlated data. Sci Rep 6: 25156. https://doi.org/10.1038/srep25156

Loiselle, B.A., V.L. Sork, J. Nason, and C. Graham, 1995 Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). American Journal of Botany 82: 1420-1425. https://doi.org/10.1002/j.1537-2197.1995.tb12679.x

Lynch, M., and B. Walsh, 1998 Genetics and Analysis of Quantitative Traits. Sunderland, MA, USA: Sinauer Associates, Inc.

The second question concerns the determination of candidate genes. Frankly, I don't think taking 100 genes around each significant SNP was the right choice. Gene density differs among regions of the genome. In linkage-based QTL mapping, the 95% confidence interval is used, while LD length is used in GWAS. So, I think there should also be a strategy for defining the candidate QTL region in the model you used. Yes, different strategies, including 100 genes, have been used in some published papers, but this doesn't mean all of them were reasonable.

RE: Thank you for pointing out this issue again. As you suggested, we revised the investigation of candidate genes using LD analysis. However, the average LD length was estimated to be ~650 bp in our previous study (Chen et al., 2021; https://www.doi.org/10.1093/g3journal/jkaa053), which is too short to serve as a range for candidate genes. Therefore, we considered the genes within an LD block that contained a significant SNP when investigating candidate genes. The corresponding sections in M&M and Results, as well as Table 3 and S10 Table, were thoroughly revised.
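
The LD-block rule described in this response can be sketched as a simple interval query: find the block that contains the significant SNP, then keep every gene that overlaps that block. The block boundaries, gene names, and the function `genes_in_ld_block` below are hypothetical and assume that all coordinates lie on the same chromosome; how the authors delimited their LD blocks is not specified here.

```python
def genes_in_ld_block(snp_pos, ld_blocks, genes):
    """Names of genes overlapping the LD block that contains snp_pos."""
    for block_start, block_end in ld_blocks:
        if block_start <= snp_pos <= block_end:
            # Two intervals overlap when each starts before the other ends.
            return [name for name, g_start, g_end in genes
                    if g_start <= block_end and g_end >= block_start]
    return []  # the SNP is not inside any listed block


# Hypothetical LD blocks (start, end) and gene annotations (name, start, end).
ld_blocks = [(10_000, 60_000), (80_000, 95_000)]
gene_table = [("geneA", 5_000, 12_000), ("geneB", 30_000, 33_000),
              ("geneC", 70_000, 75_000)]

print(genes_in_ld_block(40_000, ld_blocks, gene_table))
```

Unlike a fixed count of neighbouring genes, this rule lets the candidate region follow the local extent of LD around each SNP.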

The third question concerns the comparison between your results and a previous, very similar study. The authors didn't perform the comparison because they could not find physical positions of SNPs in that paper. Is that true? I am quite sure there was a physical position for each SNP. The authors of that study also provided the physical position of each candidate region.

RE: Thank you for pointing out this issue again, and we apologize for not considering it carefully in the previous revision. In fact, the QTL positions were reported in genetic distance (cM), not in physical distance (bp). However, their physical positions can be determined from the flanking SNPs. Therefore, we performed the comparison by adding that work as you suggested. Please see the result in S11 Table. Accordingly, the last paragraph of the Discussion was modified.

I found the quality of the study to be far above the quality of the writing in this paper. Frankly, your work could be published in other highly reputable journals. The authors should carefully revise their manuscript.

RE: Thank you very much for this positive comment. We carefully revised those paragraphs for investigating candidate genes and the comparison study with previous similar works.

Attachment

Submitted filename: Response to Reviewers 20211005.docx

Decision Letter 2

Karthikeyan Adhimoolam

18 Oct 2021

Multivariate Genome-Wide Association Study of Leaf Shape in a Populus deltoides and P. simonii F1 Pedigree

PONE-D-21-14578R2

Dear Dr. Chunfa Tong,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Karthikeyan Adhimoolam

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Karthikeyan Adhimoolam

20 Oct 2021

PONE-D-21-14578R2

Multivariate Genome-Wide Association Study of Leaf Shape in a Populus deltoides and P. simonii F1 Pedigree

Dear Dr. Tong:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Karthikeyan Adhimoolam

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig

    Histograms with probability density curves (red) of normal distributions for each univariate trait of L (A), W (B), W31 (C), W21 (D), W32 (E), A (F), and the ratio of L to W (G) in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

    (DOCX)

    S2 Fig

    Manhattan plots of the association analyses without genomic control for each univariate trait of L (A), W (B), W31 (C), W21 (D), W32 (E), A (F), and the ratio of L to W (G) across the 19 chromosomes of the reference genome of P. trichocarpa. The horizontal dashed line indicates the genome-wide significance threshold of 5.33 on the −log10(p-value) scale, based on the Bonferroni correction at the 0.05 significance level.

    (DOCX)

    S1 Table. The RADseq data information for the 2 parents and 163 progeny in the F1 hybrid population of Populus deltoides and Populus simonii.

    (DOCX)

    S2 Table. The raw data of leaf length, different widths and area from a randomized complete block design in Populus.

    (XLSX)

    S3 Table. Relative difference between leaf measurements obtained with the two software packages ImageJ and LeafShape.

    (XLSX)

    S4 Table. Correlation coefficients among the leaf traits of L, W, W31, W21, W32, A, and the ratio of L to W in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

    (DOCX)

    S5 Table. Analysis of variance for the leaf parameters of L, W, W1/3, W1/2, W2/3, and Area in the randomized complete block experiment derived from the F1 progeny of Populus deltoides × Populus simonii.

    (DOCX)

    S6 Table. Canonical correlation coefficients among the leaf length, widths, area, the length/width ratio, and polar radii in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

    (DOCX)

    S7 Table. Correlation coefficients between the first principal component of different radius datasets and the leaf length, different widths, or area in the randomized complete block design derived from the F1 progeny of Populus deltoides × Populus simonii.

    (DOCX)

    S8 Table. Summary of significant SNPs associated with each trait of the four different leaf widths and area without genomic control.

    (DOCX)

    S9 Table. Summary of significant SNPs associated with the ratio of the leaf length to the maximum width without genomic control.

    (DOCX)

    S10 Table. The annotation of candidate genes for the significant SNPs with the non-redundant protein database at NCBI and GO database.

    (XLSX)

    S11 Table. Consistency between significant SNPs for poplar leaf shape identified in the current and previous studies.

    The distance between two nearby SNPs from different studies is given in brackets.

    (XLSX)

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: Response to Reviewers 20211005.docx

    Data Availability Statement

    The RADseq data are available in the SRA database at NCBI (http://www.ncbi.nlm.nih.gov/Traces/sra) with the accession numbers listed in S1 Table. Other relevant data are within the paper and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
