The Plant Cell. 2015 Aug 11;27(8):2119–2132. doi: 10.1105/tpc.15.00328

Conserved Gene Expression Programs in Developing Roots from Diverse Plants

Ling Huang, John Schiefelbein
PMCID: PMC4568505  PMID: 26265761

Gene expression maps from the roots of seven plant species show a surprising degree of conservation in the genes and expression patterns employed during root development.

Abstract

The molecular basis for the origin and diversification of morphological adaptations is a central issue in evolutionary developmental biology. Here, we defined temporal transcript accumulation in developing roots from seven vascular plants, permitting a genome-wide comparative analysis of the molecular programs used by a single organ across diverse species. The resulting gene expression maps uncover significant similarity in the genes employed in roots and their developmental expression profiles. The detailed analysis of a subset of 133 genes known to be associated with root development in Arabidopsis thaliana indicates that most of these are used in all plant species. Strikingly, this was also true for root development in a lycophyte (Selaginella moellendorffii), which forms morphologically different roots and is thought to have evolved roots independently. Thus, despite vast differences in size and anatomy of roots from diverse plants, the basic molecular mechanisms employed during root formation appear to be conserved. This suggests that roots evolved in the two major vascular plant lineages either by parallel recruitment of largely the same developmental program or by elaboration of an existing root program in the common ancestor of vascular plants.

INTRODUCTION

The establishment of plants on land over 400 million years ago represented a critical stage in the history of life on Earth (Kenrick and Crane, 1997; Raven and Edwards, 2001; Gensel, 2008; Doyle, 2013). This transition was associated with numerous physiological and developmental innovations in plants, including, in some lineages, the evolution of an exploratory multicellular subterranean organ (the root) suited for effective water and nutrient acquisition and plant anchorage. Considering the fossil record and root morphology in extant plants, it is generally accepted that roots evolved independently on more than one occasion during vascular plant evolution (Kenrick and Crane, 1997; Raven and Edwards, 2001; Friedman et al., 2004; Kenrick and Strullu-Derrien, 2014).

Roots from extant plant species vary widely in size, cellular anatomy, and physiological properties (Figure 1) (Esau, 1965; Rost, 2011; Seago and Fernando, 2013). Furthermore, the overall architecture of a root system can vary (e.g., tap root versus fibrous root system) and typically includes different root types (e.g., primary, lateral, and adventitious roots). Nevertheless, these roots share certain fundamental features, including a set of terminal protective cells (the root cap), a self-sustaining stem cell population (the root apical meristem), a radial organization of basic tissue types (from outermost epidermis tissue to innermost vascular tissue), the ability to form branch roots, the capacity to acquire and transport water and nutrients, and a tendency to grow in a downward direction (positive gravitropism). Furthermore, a common feature of root development is the spatial separation of the major cellular activities in distinct zones along the longitudinal axis at the root tip, typically including the meristematic zone (MZ; cell division, cellular patterning, and root cap formation), elongation zone (EZ; cell expansion), and differentiation zone (DZ; cell maturation) (Figure 1).

Figure 1.

Root Development in Land Plant Species.

(A) Primary seedling roots of Arabidopsis (4 d), rice (4 d), S. moellendorffii (grown from bulbils, 21 d), tomato (6 d), cucumber (4 d), maize (4 d), and soybean (4 d). M, meristematic; E, elongation; D, differentiation; ED, overlapping elongation and differentiation. Bars = 0.1 mm.

(B) Transverse section of primary seedling roots of each species. Pseudo-color green, epidermis; red, cortex; blue, endodermis + pericycle + vascular tissue; yellow, exodermis + sclerenchymatous layer. Bars = 0.1 mm.

The molecular genetic basis for root development has been studied intensively in Arabidopsis thaliana and a large collection of genes involved in the patterning, growth, cellular differentiation, and maintenance of roots has been identified in this species (Bennett and Scheres, 2010; Petricka et al., 2012). Furthermore, global gene expression patterns have been defined for specific cells, tissues, and developmental stages of Arabidopsis roots, primarily using microarray-based methods (Galbraith and Birnbaum, 2006; Brady et al., 2007). Molecular studies of root development have also been conducted in other plants (Hochholdinger and Tuberosa, 2009; Jansen et al., 2013; Qiao and Libault, 2013; Karve and Iyer-Pascuzzi, 2015), although the extent to which the molecular mechanisms identified in Arabidopsis roots apply to other plants is not clear.

Here, we describe a broad molecular analysis of gene expression in developing roots of vascular plants, focusing on the tips of early-stage roots prior to branching. This was facilitated by the similar developmental zonation in roots and the availability of genome sequence information from diverse plant species, enabling a detailed comparative analysis of temporal gene expression patterns during root formation in seven plant species. The resulting gene expression maps indicate that, despite considerable variation in the size and cellular anatomy of roots from different species, these roots share a common developmental program. These data provide a foundation for the use of the plant root as a model for exploring the conservation and diversification of molecular mechanisms during plant organ evolution.

RESULTS

Gene Expression in Root Development Zones of Arabidopsis

We first analyzed gene expression in developing Arabidopsis primary roots by sequencing mRNA from longitudinal sections of the MZ, EZ, and DZ (Figure 1A; see Methods). This yielded a total of 21,037 root-expressed genes (mean fragments per kilobase per million mapped reads [FPKM] ≥ 0.5 in MZ, EZ, or DZ; expression detected in at least two out of three biological replicates), exhibiting diverse transcript accumulation patterns in the three zones (see sequence submission information; Supplemental Data Set 1). As validation, we surveyed the literature and found that our expression data match the transcript accumulation profiles previously determined by in situ RNA hybridization for each of 19 genes (Supplemental Table 1) (Birnbaum et al., 2005). More broadly, our RNA-Seq-based gene expression values positively correlate with previous microarray-based expression values (Brady et al., 2007) obtained from equivalent Arabidopsis developmental zones (r values: 0.72 [MZ], 0.65 [EZ], and 0.7 [DZ]; see Methods for details), although, as expected, our RNA-Seq data exhibit substantially greater dynamic range, permitting greater accuracy for extreme expression values (Figure 2).
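The expression filter stated above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in the study: the function name `is_root_expressed`, the input layout, and the example values are hypothetical, and we assume that "detected" means FPKM > 0 and that both criteria are evaluated within the same zone.

```python
from statistics import mean

FPKM_MIN = 0.5      # mean-FPKM threshold stated in the text
MIN_DETECTED = 2    # replicates (out of three) that must show expression

def is_root_expressed(zone_replicates):
    """zone_replicates maps a zone name to its replicate FPKM values.

    Assumption: a gene passes if, in at least one zone, the mean FPKM
    reaches the threshold AND at least two replicates have FPKM > 0.
    """
    for values in zone_replicates.values():
        detected = sum(1 for v in values if v > 0)
        if mean(values) >= FPKM_MIN and detected >= MIN_DETECTED:
            return True
    return False

# Hypothetical replicate values for two genes.
example = {
    "gene_a": {"MZ": [1.2, 0.9, 1.1], "EZ": [0.1, 0.0, 0.2], "DZ": [0.0, 0.0, 0.0]},
    # gene_b has a mean MZ FPKM of 0.6 but is detected in only one replicate.
    "gene_b": {"MZ": [0.0, 0.0, 1.8], "EZ": [0.0, 0.0, 0.0], "DZ": [0.2, 0.1, 0.0]},
}
expressed = [gene for gene, zones in example.items() if is_root_expressed(zones)]
print(expressed)  # ['gene_a']
```

The replicate requirement guards against a gene passing the mean threshold on the strength of a single outlying replicate, as for the hypothetical `gene_b`.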

Figure 2.

Comparison of RNA-Seq Results to Published Microarray Data on Gene Expression of Samples from MZ, EZ, and DZ.

MZ (A), EZ (B), and DZ (C). RNA-Seq FPKM values were averaged from three independent biological replicates. Microarray data were averaged from two independent biological replicates. One was added to all values prior to log2 transformation. r = Pearson’s correlation coefficient.
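The transformation and correlation described in this legend can be illustrated as follows. This is a sketch under stated assumptions: the helper names and the expression values are hypothetical, and the Pearson coefficient is computed directly from its definition rather than with any particular statistics package.

```python
import math

def log2_plus_one(values):
    """Add one to every value before the log2 transform, so zeros map to 0."""
    return [math.log2(v + 1.0) for v in values]

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-gene averages for one zone (not data from the study).
rnaseq_fpkm = [0.0, 1.0, 3.0, 7.0, 15.0]
microarray = [1.0, 3.0, 7.0, 15.0, 31.0]

r = pearson_r(log2_plus_one(rnaseq_fpkm), log2_plus_one(microarray))
```

Adding one before the transform keeps genes with zero measured expression in the comparison, because log2(0 + 1) = 0 is defined whereas log2(0) is not.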

Using previously defined Arabidopsis gene family assignments (GreenPhyl v4; Rouard et al., 2011), we discovered that the root-expressed Arabidopsis genes are not randomly distributed, but tend to cluster among families (P < 0.001, χ2 test; Supplemental Data Sets 2 and 3). Similarly, we observed a nonrandom (clustering) distribution among families for those genes exhibiting preferential root zone expression (≥2.0 fold change [FC]; false discovery rate [FDR] ≤ 0.05) in the MZ, the EZ, or the DZ (P < 0.001 for each gene set, χ2 test; Supplemental Data Sets 3 and 4). The tendency for related Arabidopsis genes to possess similar root expression characteristics implies that root expression patterns tend to be conserved in gene lineages.

Root Gene Expression Is Conserved across Angiosperms

To analyze root developmental gene expression across angiosperms, we obtained MZ, EZ, and DZ transcriptomes (three biological replicates from each zone) from the primary roots of five additional angiosperm species: three eudicots (tomato [Solanum lycopersicum], soybean [Glycine max], and cucumber [Cucumis sativus]) and two monocots (rice [Oryza sativa] and maize [Zea mays]) (Figures 1 and 3A). Principal component analysis (PCA) shows that the major variation in transcript accumulation across these samples is explained by differential expression in different development zones (PC1 + PC2 accounted for >78% total variation). This pattern of developmental zone-driven gene expression variation is highly correlated across all six of the angiosperms tested (Figure 3B).

Figure 3.

Gene Expression Preferences in Three Development Zones across Six Angiosperm Species.

(A) Ratio of genes expressed (average FPKM ≥ 0.5, at least two replicates have expression) in the root development zones across species. At, Arabidopsis; Cs, cucumber; Gm, soybean; Os, rice; Sl, tomato; Zm, maize; M, meristematic; E, elongation; D, differentiation.

(B) Merged PCA of gene expression in the root development zones of samples from six angiosperm species. PCA was performed on all individual biological replicates from the same species and then plotted to the same figure.

To further investigate whether patterns of primary root gene expression are conserved among these species, we examined the distribution of root-expressed genes across angiosperm gene families. Among the 6613 gene families that possess at least one gene from each of these six angiosperms (Rouard et al., 2011), we discovered a nonrandom distribution of root-expressed genes (P < 0.001; χ2 test), with root-expressed genes from different species tending to associate in families (Supplemental Data Sets 2 and 3). This suggests that gene families devoted to root functions have been conserved during angiosperm evolution.

Next, we evaluated the possibility that related genes from different species possess similar temporal root expression profiles across the three developmental zones. Using a fuzzy C-means clustering algorithm, root-expressed genes exhibiting variation in transcript accumulation across the MZ, EZ, and DZ (≥2.0 FC between any two zones) from each of the six species were assigned to one of nine dominant gene expression profile types (Figure 4A; Supplemental Data Set 5). Pairs of species were then compared to determine whether matching profile types were preferentially observed for genes in the same families (using gene families possessing exactly one or two genes from each of the six species; see Methods). Indeed, the most statistically significant familial associations were found for genes with the same expression profile type (Figure 4B). Specifically, among the 135 pairwise interspecies comparisons for the same profile type, 132 of them exhibited significant familial association (corrected P < 0.01, Fisher’s exact test; Figure 4B). Furthermore, we found that the average ratio of overlapping expression types in a given family between species mirrored their known phylogenetic relationships (Figure 4C; see Methods). These results suggest conservation of the regulation of gene expression during primary root development in angiosperms.
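The familial association test can be illustrated with a one-tailed Fisher's exact test on a 2 × 2 table of families. The sketch below is a minimal illustration, not the authors' pipeline: the table layout, the function name `fisher_one_tailed`, and the example counts are assumptions, and no multiple-testing correction is applied because the correction method is not given in this section.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed (enrichment) Fisher's exact test for a 2 x 2 table.

    Assumed layout, counting gene families for one profile type:
        a: type present in species 1 and species 2
        b: type present in species 1 only
        c: type present in species 2 only
        d: type present in neither
    Returns P(X >= a) under the hypergeometric null distribution.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, upper + 1))
    return favorable / comb(n, col1)

# Hypothetical counts for one species pair and one profile type.
p_value = fisher_one_tailed(3, 1, 1, 3)   # 17/70, about 0.243

# Six species give comb(6, 2) = 15 pairs; with nine profile types this
# yields the 135 same-type comparisons mentioned in the text.
n_comparisons = comb(6, 2) * 9
```

A one-tailed test is the natural choice here because the question is whether matching profile types co-occur within families more often than expected by chance, not whether they differ in either direction.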

Figure 4.

Comparison of Root Gene Expression across Angiosperms.

(A) Nine expression profile types in the MZ, EZ, and DZ assigned by fuzzy C-Means clustering. Gene expression with at least 2 FC between zones was standardized to have mean of 0 and sd of 1. High affinity to the cluster centroid is shown in purple and low in green.

(B) Heat map of corrected one-tail P values of pairwise Fisher's exact test for association of expression profile types (1 to 9) within gene families for each of the six species.

(C) Hierarchical clustering dendrogram based on the average dissimilarity ratio of gene families not sharing the same expression profile types.

(D) Heat map of corrected one-tail P values of pairwise Fisher's exact test for association of expression profile types (1 to 9) within supergene families for each of the six species.

(E) Density plot of family total connectivity, which summed the CNVs of the supergenes of each species within given families. Red, 6161 GreenPhyl-defined families with root-expressed supergenes from all six angiosperms; green, 73 GreenPhyl-defined families containing Arabidopsis known key root development genes; blue, 1000 gene families with randomly assigned member genes.

(F) Example of one of the 71 maximum likelihood phylogenetic trees reconstructed for relatives of Arabidopsis key root development genes. Tree reconstructed for Arabidopsis (At) CRN (shaded) and its relatives from rice (Os), maize (Zm), cucumber (Cs), tomato (Sl), and soybean (Gm). Also included is a heat map indicating gene expression in the three development zones (M, meristematic zone; E, elongation zone; D, differentiation zone) (values indicate FPKM), the gene expression profile types (Type), and CNVs. The well-defined CRN clade with family members from all of the angiosperm species is shaded in light blue.

A difficulty in accurately comparing gene expression between species is the variation in gene number per species within families. To address this and enable an aggregate comparison of gene expression profiles in multigene families, we generated “supergenes” for each species, by summing root transcript accumulation values (separately for the MZ, EZ, and DZ) for all genes from the same species within a given family (Supplemental Data Set 6). These supergenes were clustered into nine expression profile types, similar to above, and using pairwise species comparisons, we again observed statistically significant familial association of supergenes bearing the same expression type (Figure 4D).
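The supergene construction amounts to a grouped sum. The following sketch shows one way to express it, assuming a hypothetical input of (species, family, per-zone FPKM) records; the function name `build_supergenes` and the example values are illustrative only.

```python
from collections import defaultdict

ZONES = ("MZ", "EZ", "DZ")

def build_supergenes(genes):
    """Sum transcript values by zone for all genes that share a species and family.

    genes: iterable of (species, family, {zone: fpkm}) records.
    Returns {(species, family): {zone: summed fpkm}}.
    """
    supergenes = defaultdict(lambda: dict.fromkeys(ZONES, 0.0))
    for species, family, fpkm in genes:
        for zone in ZONES:
            supergenes[(species, family)][zone] += fpkm[zone]
    return dict(supergenes)

# Hypothetical records: two Arabidopsis genes and one rice gene in one family.
genes = [
    ("At", "fam1", {"MZ": 2.0, "EZ": 1.0, "DZ": 0.5}),
    ("At", "fam1", {"MZ": 4.0, "EZ": 3.0, "DZ": 1.5}),
    ("Os", "fam1", {"MZ": 1.0, "EZ": 1.0, "DZ": 1.0}),
]
supergenes = build_supergenes(genes)
print(supergenes[("At", "fam1")])  # {'MZ': 6.0, 'EZ': 4.0, 'DZ': 2.0}
```

Each species then contributes exactly one profile per family, regardless of how many paralogs it carries, which is what makes the cross-species comparison well defined.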

Next, we generated connectivity values (CNVs) for the supergenes by analyzing the correlation in expression profile between each supergene and the supergenes in its family (CNV ranges from −1 to 1; see Methods). The CNVs for all six supergenes in a given family were summed (for the 6161 families that contain root-expressed supergenes from each of the six angiosperms), as an estimate for expression profile similarity across species’ genes in the family. Strikingly, a large fraction of these 6161 families exhibit very high summed CNVs, relative to families constructed by random assignment of genes (Figure 4E; Supplemental Data Set 7). This provides additional evidence for the conservation of root gene expression profiles at the family level across all angiosperm species.
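A possible reading of the connectivity calculation is sketched below. The exact definition is given in Methods and is not reproduced in this section, so the code assumes that a supergene's CNV is the mean Pearson correlation between its three-zone profile and the profiles of the other supergenes in its family; the function names and example profiles are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 for a flat profile (an assumption)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def connectivity_values(profiles):
    """profiles: {species: [MZ, EZ, DZ]} for one family (two or more species).

    Assumed definition: CNV = mean correlation with every other supergene.
    """
    cnvs = {}
    for species, profile in profiles.items():
        others = [pearson_r(profile, other)
                  for name, other in profiles.items() if name != species]
        cnvs[species] = sum(others) / len(others)
    return cnvs

# Hypothetical family in which all six supergenes decline from MZ to DZ.
conserved = {
    "At": [3.0, 2.0, 1.0], "Cs": [6.0, 4.0, 2.0], "Gm": [9.0, 6.0, 3.0],
    "Os": [1.5, 1.0, 0.5], "Sl": [12.0, 8.0, 4.0], "Zm": [30.0, 20.0, 10.0],
}
# Same family, except that the maize profile is reversed.
divergent = dict(conserved, Zm=[1.0, 2.0, 3.0])

total_conserved = sum(connectivity_values(conserved).values())   # close to 6
total_divergent = sum(connectivity_values(divergent).values())   # close to 2
```

Under this reading each CNV lies between −1 and 1, so the summed value for six supergenes cannot exceed 6, and a single discordant species lowers the family total sharply.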

To further study those families exhibiting the greatest conservation in root gene expression, we selected families with summed supergene CNV ≥ 5.9 (10.5% of total families). After summing the standardized supergene expression data in each of these families, the resulting “family expression profiles” were clustered into nine expression types, similar to above, and Gene Ontology (GO) term enrichment analysis was conducted for each group. Interestingly, distinct sets of significantly enriched terms were obtained from the different expression profile types (Supplemental Table 2). For example, the largest proportion of highly significant GO terms from cluster type 7 (high expression in DZ, relative to MZ and EZ) was related to transcriptional regulation (Supplemental Table 2). Considering their high level of expression profile conservation across angiosperms, these families are likely to include previously unidentified genes important for angiosperm root function.

To focus on specific genes likely involved in root development, we used available root gene information from Arabidopsis. We identified 133 Arabidopsis genes previously reported to have a function in primary root development, comprising 71 families, encoding transcription factors and other putative regulatory proteins (Supplemental Table 3) (Petricka et al., 2012). Phylogenetic trees were constructed that contain these Arabidopsis genes and their relatives from all six angiosperm species (Figure 4F; Supplemental Figure 1). In addition, our root gene expression data were mapped onto these trees, and CNVs were calculated for the genes in each clade, enabling an assessment of sequence and expression relationships. Among these 71 families, 67 yielded well-supported clades containing the known Arabidopsis root development gene(s) together with a root-expressed gene(s) from each of the other five angiosperm species (Supplemental Figure 1). Among the remaining families, one was eudicot specific and three lacked a related gene in one angiosperm species. Furthermore, the genes in these families generally exhibited conservation in root expression profiles, as demonstrated by their high family CNVs, compared with randomly constructed families (Figure 4E). Together, the results suggest that regulators of primary root development genetically defined in Arabidopsis are employed similarly by other angiosperms.

Analysis of Root Gene Expression in the Lycophyte Selaginella moellendorffii

Roots are found in two major clades of extant vascular plants: euphyllophytes (including angiosperms, gymnosperms, and monilophytes [ferns]) and lycophytes (a non-seed plant clade that diverged from the euphyllophyte lineage ∼400 million years ago; Banks, 2009). Therefore, to compare root development more broadly, we analyzed root gene expression in a sequenced lycophyte, S. moellendorffii (Banks et al., 2011). We defined the transcriptomes of S. moellendorffii roots (produced from rhizophores via bulbils) from the MZ and the combined EZ + DZ (EDZ) (necessary due to the superimposition of EZ and DZ characters in the S. moellendorffii root; Figure 1A) for three biological replicates each (Figures 5A and 5B; Supplemental Data Set 8). Among the 5465 gene families containing at least one gene from S. moellendorffii and each of the six angiosperms (defined by GreenPhyl; Rouard et al., 2011), we discovered a significant association by family for root-expressed genes in S. moellendorffii and root-expressed genes in angiosperms (P < 0.001; Fisher’s exact test; Supplemental Data Sets 2 and 3). Specifically, 81.6% of families that contained a root-expressed gene from each of the six angiosperms also contained a root-expressed S. moellendorffii gene.

Figure 5.

Comparison of Root Gene Expression across Seven Vascular Plants.

(A) Ratio of genes expressed (average FPKM ≥ 0.5, at least two replicates have expression) in the root development zones across species. Sm, S. moellendorffii; ED, overlapping elongation and differentiation zone.

(B) Merged PCA of gene expression in the root development zones of samples from seven vascular plant species. PCA was performed on all individual biological replicates from the same species and then plotted to the same figure.

(C) Heat map of corrected one-tail P values of pairwise Fisher's exact test for association of expression profile types (1 to 5) assigned by EDZ/MZ FC within gene families for each of the seven species.

(D) Heat map of corrected one-tail P values of pairwise Fisher's exact test for association of expression profile types (1 to 5) assigned by EDZ/MZ FC within supergene families for each of the seven species.

(E) Correlation matrix of supergene expression log2 EDZ/MZ FC in 5027 GreenPhyl-defined families. Heat map was reordered according to the hierarchical clustering result.

(F) Scatterplot comparing supergene expression log2 EDZ/MZ FC between the average of all angiosperms versus S. moellendorffii for 5027 GreenPhyl-defined families. Colors indicate family total connectivity values (orange, low; purple, high) (see also Figure 4E). Angiosperm average was calculated as the mean of the eudicot average and the monocot average, giving equal weight to the two clades. Regression F-statistic P < 0.001.

We next compared root gene expression profiles from the six angiosperms and S. moellendorffii by converting the three-zone angiosperm transcript data to two-zone (by combining the EZ and DZ expression values; ≥0.5 FPKM in at least one zone, expression detected in at least two out of three biological replicates) (Supplemental Data Set 8). We then assigned each gene to one of five expression profile types, based on expression fold change between the MZ and EDZ (FPKM EDZ/FPKM MZ; type 1, FC ≥ 3; type 2, 1.5 < FC < 3; type 3, 0.67 ≤ FC ≤ 1.5; type 4, 0.33 < FC < 0.67; type 5, FC ≤ 0.33; Supplemental Data Set 9). Using pairwise species comparisons to assess the frequency of matching expression patterns within families, we discovered significant familial association when the same expression types were compared between species (Fisher’s exact test; Figure 5C), indicating that gene expression profiles are generally conserved within families containing S. moellendorffii and angiosperm genes.
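The five-type assignment can be written as a short threshold cascade. This sketch uses the fold-change cutoffs given above; the function name `fc_type` is hypothetical, boundary values are resolved by checking the thresholds in order (FC = 0.67 is assigned to type 3 and FC = 0.33 to type 5), and the handling of a zero MZ value is an assumption not described in the text.

```python
def fc_type(fpkm_mz, fpkm_edz):
    """Assign expression profile type 1 to 5 from the EDZ/MZ fold change."""
    if fpkm_mz == 0:
        if fpkm_edz == 0:
            raise ValueError("gene is not expressed in either zone")
        return 1              # assumption: treat an infinite FC as type 1
    fc = fpkm_edz / fpkm_mz
    if fc >= 3:
        return 1              # strongly EDZ preferential
    if fc > 1.5:
        return 2              # moderately EDZ preferential
    if fc >= 0.67:
        return 3              # similar in both zones
    if fc > 0.33:
        return 4              # moderately MZ preferential
    return 5                  # strongly MZ preferential

# Hypothetical (MZ, EDZ) FPKM pairs covering each type.
pairs = [(1.0, 3.0), (1.0, 2.0), (1.0, 1.0), (2.0, 1.0), (10.0, 3.0)]
types = [fc_type(mz, edz) for mz, edz in pairs]
print(types)  # [1, 2, 3, 4, 5]
```

The cutoffs are roughly symmetric on a log scale (3 and 0.33, 1.5 and 0.67), so types 1 and 5, and types 2 and 4, describe changes of similar magnitude in opposite directions.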

To compare the overall degree of similarity in expression profiles in these species, we generated supergenes for each species’ genes in a given family (by summing expression values by zone) and compared supergene expression FC between the MZ and EDZ in S. moellendorffii and the six angiosperms by family (Supplemental Data Set 10). We assigned each of the supergenes to one of the five expression profile types based on their expression FC and, using pairwise species comparisons, we again observed a significant familial association of genes from different species exhibiting the same profile type (Fisher’s exact test; Figure 5D; Supplemental Data Set 10). We also discovered an overall positive correlation between gene expression profiles in S. moellendorffii and each angiosperm (r values: 0.49 to 0.57), albeit lower than for intra-angiosperm comparisons, consistent with the greater evolutionary divergence between S. moellendorffii and the angiosperms (Figure 5E). Next, we calculated the average EDZ/MZ expression for supergenes from each of the six angiosperms (in a weighted manner; see Methods) within a given family to generate a combined angiosperm EDZ/MZ expression value that was compared with the corresponding S. moellendorffii supergene’s value from the same family. We found a strong correlation in these EDZ/MZ values (F-statistic P < 0.001; Figure 5F), providing further evidence for family-dependent similarity in root gene expression profiles between S. moellendorffii and angiosperm genes. Interestingly, mapping angiosperm family CNV onto these results shows that supergenes with extreme expression FC values tend to exist in high-CNV families (Figure 5F), implying that the most conserved expression patterns are the ones that exhibit the greatest difference between these developmental zones.
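The clade-balanced angiosperm average described for Figure 5F (the mean of the eudicot average and the monocot average) can be sketched as follows. The function name, the species abbreviations used as dictionary keys, and the example values are illustrative, and we assume the averaging is done on log2 EDZ/MZ values.

```python
from statistics import mean

EUDICOTS = ("At", "Cs", "Gm", "Sl")   # Arabidopsis, cucumber, soybean, tomato
MONOCOTS = ("Os", "Zm")               # rice, maize

def angiosperm_average(log2_fc):
    """Mean of the eudicot mean and the monocot mean for one family.

    log2_fc maps a species abbreviation to its supergene log2(EDZ/MZ).
    """
    eudicot_mean = mean(log2_fc[s] for s in EUDICOTS)
    monocot_mean = mean(log2_fc[s] for s in MONOCOTS)
    return (eudicot_mean + monocot_mean) / 2

# Hypothetical family: eudicots at log2 FC = 2, monocots at log2 FC = 0.
family = {"At": 2.0, "Cs": 2.0, "Gm": 2.0, "Sl": 2.0, "Os": 0.0, "Zm": 0.0}

balanced = angiosperm_average(family)   # 1.0
unweighted = mean(family.values())      # about 1.33, dominated by eudicots
```

Averaging the two clade means first prevents the four eudicot species from outweighing the two monocots in the combined angiosperm value.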

To assess relationships among specific genes likely to encode root regulators, we analyzed S. moellendorffii genes related to the 133 known Arabidopsis root development genes (Supplemental Table 3). Phylogenetic trees were constructed that contain these 71 families of Arabidopsis genes and related genes from rice (as a representative monocot), S. moellendorffii, Norway spruce (Picea abies, a gymnosperm), and the moss Physcomitrella patens (a nonvascular, rootless plant), and we mapped our root expression data and calculated gene CNVs for the resulting clades (Supplemental Figure 2). Strikingly, we discovered that 67 of the 71 well-supported clades containing the Arabidopsis root genes also possessed a root-expressed S. moellendorffii gene(s) (Figures 6 and 7; Supplemental Figure 2). Furthermore, in 53 of these 67 clades, at least one S. moellendorffii gene matched the expression profile type of the Arabidopsis gene(s). These results indicate that S. moellendorffii largely possesses and expresses the same genes known to be critical for root development in angiosperms. In addition, we found that 70 of these 71 clades possessed a related gene from Norway spruce (Supplemental Figure 2), suggesting that conservation of the developmental gene program extends to gymnosperms, although we do not have root expression data to fully support this suggestion.

Figure 6.

Maximum Likelihood Tree Constructed for Relatives of Arabidopsis GNOM.

The gene expression levels from the two development zones (M, meristematic zone; ED, combined elongation + differentiation zone) are shown as heat maps (values indicate FPKM), together with the expression FC type. The closely related Arabidopsis gene EDA10 was used as the outgroup to root the tree. Expression FC types (1 to 5) for the genes are shown in different colors, whereas light gray indicates that no root expression was detected. The target gene (Arabidopsis GNOM) is shaded in gray, and the target clade with family members from all species is shaded in light blue. Related genes are also included from two species (Pa, Norway spruce; Pp, P. patens), which did not have their root expression assessed, indicated by blank spaces in the heat maps.

Figure 7.

Maximum Likelihood Tree Constructed for Relatives of Arabidopsis FEZ.

The gene expression levels from the two development zones (M and ED) are shown as heat maps, together with the expression FC type. The closely related Arabidopsis gene ANAC056 was used as the outgroup to root the tree. Expression FC types (1 to 5) for the genes are shown in different colors, whereas light gray indicates that no root expression was detected. The target gene (Arabidopsis FEZ) is shaded in gray, and the target clade with family members from all vascular plant species is shaded in light blue. Related genes are also included from two species (Pa, Norway spruce; Pp, P. patens), which did not have their root expression assessed, indicated by blank spaces in the heat maps.

Among these 71 clades, two possess Arabidopsis genes regulating root cap formation. The root cap is believed to be a root-specific innovation with no shoot counterpart (Barlow, 2002; Bennett and Scheres, 2010). The FEZ gene of Arabidopsis promotes root cap stem cell activity (Willemsen et al., 2008), and, interestingly, it is part of a vascular plant-specific clade with representatives sharing preferential MZ expression (Figure 7). The SOMBRERO, BEARSKIN1, and BEARSKIN2 genes participate in root cap maturation in Arabidopsis (Bennett et al., 2010) and are included in a large clade with a preferentially MZ-expressed gene(s) from each vascular plant species (Supplemental Figures 1 and 2). These results are consistent with the possibility that root cap-associated gene functions are shared in vascular plants.

Lastly, we considered the possibility that the similarity in root gene expression among these species is due to a general molecular program acting in all developing organs. To examine this, we compared our root transcriptome data sets to an available transcriptome data set from the shoot inflorescence meristem of Arabidopsis (Mantegazza et al., 2014). We identified Arabidopsis gene families that lack any shoot meristem-expressed genes and then analyzed the distribution of root-expressed genes from the seven species within these families (Supplemental Data Set 3). We discovered a statistically significant association of root-expressed genes by family for each pairwise comparison of Arabidopsis and each of the other species (corrected P < 0.001 for each comparison, Fisher’s exact test), implying conserved root expression characteristics for these families of genes that are not expressed in all developing organs. For example, 96% of the families (102/106) possessing a S. moellendorffii root-expressed gene (and lacking an Arabidopsis shoot meristem-expressed gene) also possessed a root-expressed gene from at least one of the angiosperm species. Although limited by its use of a single non-root developing organ, this analysis indicates that the similarity in root gene expression we observed among these species is not likely to be due solely to a general molecular program shared by all developing plant organs.

DISCUSSION

In this study, we defined gene expression maps from the developing roots of seven different plant species, enabling a comprehensive comparative analysis of the molecular genetic control of a developing organ type in plants. The most general finding is that, despite the vastly different sizes and cellular structures of roots from these seven species (Figure 1), there is substantial conservation in the usage and expression of their genes in root development. Regarding conservation in gene usage, we discovered a statistically significant degree of overlap in the gene families containing root-expressed members from the various species, indicating that related genes are used for root development across all species. Regarding conservation in gene expression, we found significant similarities in the expression profiles from the root developmental zones for related genes in the same family across different species. These findings were observed both in a genome-wide analysis of all root-expressed gene families and in the specific analysis of 71 families containing 133 Arabidopsis genes encoding known root developmental regulators. These results suggest that a common molecular program, employing similar genes and gene regulation, is used in developing roots across vascular plants.

These findings provide insight into the history of gene innovation/recruitment during root evolution in vascular plants (Figure 8). A large number of gene families (6004), including a large fraction of the known Arabidopsis root developmental gene families (67/71), contain root-expressed genes from each of the seven vascular plant species, implying that representatives of these families were recruited to participate in root development at an early stage (Figure 8). Smaller numbers of root gene families possess root-expressed genes in specific plant subgroups (e.g., angiosperms, eudicots, and monocots) or from a single species, likely reflecting later gene gains and losses in distinct lineages. These lineage-specific gene families may be responsible, in part, for the differing root characteristics that exist in particular plant clades. Extending this study to include additional species with varying root architecture (e.g., fibrous versus tap root) and to include different root types (e.g., primary versus lateral versus adventitious) will likely link specific genes/families to particular root characters and provide a more complete picture of root evolution. We also note that some of these families possess a relatively large number of angiosperm genes, indicating substantial gene expansion in certain angiosperm lineages (as previously reported for other developmental gene families; Feller et al., 2011), which may also explain some of the variation in root phenotypes.

Figure 8.

Evolutionary History of Root-Expressed Gene Families.

Depiction of the phylogenetic relationships of the species examined in this study and putative origin of their root-expressed gene families. Blue, GreenPhyl-defined gene families; red, gene families containing an Arabidopsis gene known to be associated with root development (reconstructed by maximum likelihood). Positive numbers refer to putative lineage-specific gain of families containing root-expressed genes; negative numbers refer to putative lineage-specific loss of families containing root-expressed genes. See text for discussion.

It is remarkable that the roots of S. moellendorffii and angiosperms appear to share a similar molecular developmental program because the lycophyte and euphyllophyte lineages of plants are generally thought to have evolved roots independently (Kenrick and Crane, 1997; Raven and Edwards, 2001). This view is supported by the available fossil evidence, which indicates that early euphyllophytes lacked roots at a time when lycophytes possessed them (Raven and Edwards, 2001; Friedman et al., 2004). Furthermore, lycophyte roots exhibit some unusual developmental features, including branching by bifurcation rather than the endogenous lateral root formation typical of euphyllophytes (Raven and Edwards, 2001; Banks, 2009).

We consider two general explanations for the similar root gene expression patterns in lycophytes and angiosperms: parallel recruitment of largely the same developmental program independently in the lycophyte and euphyllophyte lineages or the existence of a primitive root developmental program in their common ancestor. Regarding the possibility of parallel recruitment, strong selective pressures and a limited genetic “toolkit” may have restricted the evolutionary path for root formation in both lineages. In this vein, it is notable that roots and shoots of extant plants deploy many of the same (or closely related) developmental genes (Benfey, 1999; Stahl and Simon, 2010), likely due to their common origin from a primitive telomic axis (Kenrick and Crane, 1997; Gensel and Berry, 2001; Friedman et al., 2004; Ligrone et al., 2012; Tomescu et al., 2014). Similarly, roots that evolved independently in separate lineages might still be expected to share a substantial fraction of their developmental program due to recruitment from a largely common pool of organ development genes. To examine this issue rigorously, it will be necessary to define developmental transcriptomes, equivalent to the root developmental transcriptomes analyzed here, from multiple organs of lycophytes and angiosperms. Related to this, we note that the observed similarity in gene usage and expression at the family level reported here may overestimate the degree of functional similarity in these genes because individual genes within a given family may have undergone substantial functional diversification.

On the other hand, the possibility that a primitive root program existed prior to the divergence of lycophytes and euphyllophytes is also consistent with the substantial similarity in root gene expression profiles and, in particular, the gene families associated with root cap formation. The root cap was a unique evolutionary innovation, not present in the shoot (or in the presumed telomic axis precursor), so genes for its specification and formation would be expected to be distinct if roots evolved independently. Indeed, a detailed molecular dissection of the Arabidopsis root meristem has led to the proposal that the root cap, and its associated meristematic cells, is a structure that evolved separately from the major portion of the root, representing a later innovation that enabled the root to more effectively penetrate soil and generate modern-day “true roots” (Bennett and Scheres, 2010). In addition, the strength of the fossil evidence supporting independent root evolution has been called into question, due to its incomplete nature and the poor preservation of fossilized roots, leading some to consider the origin of roots an unsettled issue (Gensel, 2008). Thus, it is conceivable that the common ancestor of lycophytes and euphyllophytes had already possessed a rudimentary root developmental program, perhaps generating a transitional “rooty structure” (Doyle, 2013) that was subsequently modified. It will be necessary to conduct detailed studies of individual root genes identified here (e.g., the root cap genes) to determine whether their similarity in sequence and expression across species is mirrored by similarity in developmental function.

The gene expression data sets described here represent a useful resource for future studies of root molecular biology. In addition to eliciting and testing evolutionary hypotheses, these data should assist in the identification of new root-expressed genes and functionally related homologs of previously defined root genes. In particular, the genes with strongly conserved root expression profiles across all the species described here are likely to include novel regulators of root development and function.

METHODS

Biological Material and RNA Isolation

Seeds of Arabidopsis thaliana (Columbia), tomato (Solanum lycopersicum Heinz 1706), soybean (Glycine max Williams 82), cucumber (Cucumis sativus Gy14), rice (Oryza sativa ssp japonica cv Nipponbare), maize (Zea mays B73), and bulbils of Selaginella moellendorffii (Plant Delights) were germinated on agarose-solidified nutrient media under constant light as previously described (Schiefelbein and Somerville, 1990). The growing tips of the angiosperm seedling primary roots and rhizophore-derived S. moellendorffii roots were dissected (prior to branching) along the longitudinal axis using landmarks of cell length and root hair production. The MZ segment represented the terminal portion of the root, cut at the position where the length of cells began to exceed their width. The DZ segment extended from the first initiated root hair to the point where root hairs first reached their full length. The EZ segment represented the root portion between the MZ and DZ. Root sections were frozen immediately after collection, and total RNA was extracted from frozen samples using the Qiagen RNeasy Plant Mini Kit. Library construction was performed by the University of Michigan Sequencing Core using the Illumina TruSeq Kit, followed by sequencing on the Illumina HiSeq 2000 System.

Microscopy Imaging

Plant seedlings were embedded in 3% agarose gel and sections from the DZ were obtained by hand-sectioning. Samples were stained with Fluorescent Brightener 28 (Sigma-Aldrich) for 5 to 20 s prior to examination with an Olympus IX81 microscope.

RNA-Seq Analysis

A total of 1.649 billion 50-bp single-end reads were generated from the 60 RNA samples (average of 27.5 million reads per sample). Reads were assessed by FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and the initial 15 bp of the 50-bp reads (containing low-quality sequence information) were trimmed before further processing (Bolger et al., 2014). The reads were mapped to the corresponding reference genome by TopHat (version 2.0.3; Kim et al., 2013), with embedded Bowtie2 (version 2.0.0-beta7; Langmead and Salzberg, 2012) and SAMtools (version 0.1.19; Li et al., 2009), using default settings apart from the segment length (--segment-length 17). The mapped reads were quantified by Cufflinks2 (version 2.1.1; Trapnell et al., 2013) with multiread correction enabled (-u -G). An updated TopHat version (2.0.9 with embedded Bowtie2 2.1.0) was used for the combined region analysis with the same settings as described above.

Cucumber (v122) and S. moellendorffii (v91) reference sequences were downloaded from Phytozome (http://www.phytozome.net/). The other five genome sequences were downloaded from Ensembl (v19, http://plants.ensembl.org/index.html).

S. moellendorffii gene expression values were all multiplied by 2 to correct for the duplication of the reference genome sequences.

Transcript sequence data from the Arabidopsis shoot inflorescence meristem were downloaded from http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1946/samples/ (Mantegazza et al., 2014). For consistency in gene comparison, the downloaded raw data were remapped to the reference annotation used for this study using the same parameters listed above. The reprocessed data yielded a Spearman’s correlation coefficient of 0.94 relative to the previously published processed data (Mantegazza et al., 2014). A gene was deemed to be expressed in the inflorescence meristem if the mean FPKM was ≥ 0.5 and the FPKM was > 0 in each of the two replicates.

Gene Differential Expression Analysis

Raw counts for Arabidopsis were extracted from Cuffdiff2 (v2.1.1 with default settings plus correction for multireads and fragment bias [-b -u -N]; Trapnell et al., 2013) and analyzed with the edgeR software package (Robinson et al., 2010; Zhang et al., 2014) to define gene sets preferentially expressed in each zone. Briefly, genes with low expression were filtered out (a gene was retained only if its counts per million exceeded 1 in at least three of the nine samples). Next, the raw counts were normalized using the upper quartile method, and the sample variation was estimated by tag-wise dispersion. The default trimmed mean of M-values method was not used because its assumption that most genes are not differentially expressed was violated. The resulting P values were corrected for multiple testing by the Benjamini-Hochberg method (Benjamini and Hochberg, 1995). Genes with FC ≥ 2 and an FDR q-value ≤ 0.05 were considered to be preferentially expressed in a zone.
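The filtering and multiple-testing steps described above can be illustrated with a minimal Python sketch. This is not the edgeR implementation used in the study; the function names and the dictionary-based data layout are hypothetical, and the dispersion estimation and statistical test that produce the P values are not shown.

```python
def counts_per_million(counts):
    """Convert one sample's raw counts (gene -> count) to counts per million."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def passes_expression_filter(cpm_by_sample, gene, min_cpm=1.0, min_samples=3):
    """True if the gene exceeds min_cpm in at least min_samples samples."""
    n_above = sum(1 for cpm in cpm_by_sample if cpm[gene] > min_cpm)
    return n_above >= min_samples


def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input P values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def is_zone_preferential(fold_change, q_value, min_fc=2.0, max_q=0.05):
    """Apply the FC >= 2 and FDR q-value <= 0.05 thresholds from the text."""
    return fold_change >= min_fc and q_value <= max_q
```

The thresholds are exposed as keyword arguments only so that the cutoffs stated in the text are visible in one place.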

Comparisons between RNA-Seq and Published Microarray Data

Microarray data were downloaded from the available website (Brady et al., 2007) (http://www-plb.ucdavis.edu/labs/brady/software/BradySpatiotemporalData/). Expression values from zones 1 to 6, 7 and 8, and 9 to 12 for a given probe were combined to create the mean expression in meristematic, elongation, and differentiation zones, respectively. Next, the data from two independent biological replicates were averaged, and gene expression was measured as the mean across all its corresponding probes. RNA-Seq data were averaged from three independent biological replicates. A value of one was added to all data prior to log2 transformation. A total of 22,262 genes contained data from both data sets and were used for comparisons.
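As a rough illustration of the section-to-zone averaging and the log2 transformation described above, the following Python sketch may help. The grouping of sections 1 to 6, 7 and 8, and 9 to 12 follows the text; the function names and the input layout (a dictionary keyed by section number) are assumptions made for this example.

```python
import math

# Section numbers grouped into developmental zones, as stated in the text.
ZONE_SECTIONS = {"MZ": range(1, 7), "EZ": range(7, 9), "DZ": range(9, 13)}


def zone_means(section_values):
    """Average a probe's values (section number -> value) within each zone."""
    return {
        zone: sum(section_values[s] for s in sections) / len(sections)
        for zone, sections in ZONE_SECTIONS.items()
    }


def log2_plus_one(value):
    """Add one before the log2 transformation so that zeros remain defined."""
    return math.log2(value + 1.0)
```

Averaging across replicates and across the probes of a gene would follow the same pattern and is omitted here.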

Gene Family Information

The composition of gene families in the seven plant species was obtained from GreenPhyl (v4; Rouard et al., 2011) and is presented in Supplemental Data Set 2.

Statistical Analyses

The distribution of root-expressed and non-root-expressed genes among families in Arabidopsis was analyzed by χ2 tests, controlling for gene family size (2, 3, 4, 5, or 6 Arabidopsis genes/family) and assuming random distribution. In addition, the distribution of genes preferentially expressed in the MZ, EZ, or DZ among Arabidopsis families and the distribution of root-expressed and non-root-expressed genes from the six angiosperm species among families were analyzed by χ2 tests. All statistical analyses and graph plotting were performed in the R environment (http://www.R-project.org) unless otherwise specified. Figures were plotted using the fmsb (Nakazawa, 2014), ggplot2 (Wickham, 2009), and gplots (http://crantastic.org/packages/gplots/versions/36507) packages in R.

PCA

For samples from the same species, gene expression FPKM values were increased by 1, log2 transformed, and mean centered before PCA using the prcomp function in R. The first and second principal components were found to account for over 78% of the total variation in the data set. PC1 and PC2 from all species were plotted on the same figure.

Gene Expression Pattern Type Assignment

Root-expressed genes (defined as FPKM ≥ 0.5 in at least one zone and FPKM > 0 for at least two out of three biological replicates) with FC < 2 between all zones were assigned expression pattern type 0. The expression profiles of the remaining root-expressed genes were standardized to mean = 0 and sd = 1, followed by fuzzy C-means clustering using the R package Mfuzz (Kumar and Futschik, 2007). After monitoring the minimum distances between cluster centroids, the number of clusters was set at nine. Each gene was assigned the pattern type (types 1 to 9) for which it exhibited the highest affinity value.
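The preprocessing that precedes clustering can be sketched in Python as follows. Several details are assumptions made for illustration rather than statements of the published procedure: the FPKM ≥ 0.5 threshold is applied to the replicate mean within a zone, the replicate requirement is checked in that same zone, the FC between zones is taken as the largest ratio of zone means, and standardization uses the sample standard deviation. The fuzzy C-means step itself (Mfuzz) is not reproduced.

```python
import statistics


def is_root_expressed(replicates_by_zone, min_fpkm=0.5, min_nonzero=2):
    """replicates_by_zone maps zone name -> list of replicate FPKM values.

    Assumption: a gene qualifies if, in at least one zone, the replicate
    mean reaches min_fpkm and at least min_nonzero replicates are above 0.
    """
    for values in replicates_by_zone.values():
        mean_ok = statistics.mean(values) >= min_fpkm
        nonzero_ok = sum(1 for v in values if v > 0) >= min_nonzero
        if mean_ok and nonzero_ok:
            return True
    return False


def max_fold_change(zone_means):
    """Largest ratio between any two zone means (infinite if the minimum is 0)."""
    lowest, highest = min(zone_means), max(zone_means)
    if lowest == 0:
        return float("inf") if highest > 0 else 1.0
    return highest / lowest


def is_type_zero(zone_means, fc_cutoff=2.0):
    """Type 0: FC < 2 between all zones."""
    return max_fold_change(zone_means) < fc_cutoff


def standardize(profile):
    """Rescale a profile to mean 0 and (sample) standard deviation 1."""
    mu = statistics.mean(profile)
    sd = statistics.stdev(profile)
    return [(x - mu) / sd for x in profile]
```

Only profiles that are root expressed and not type 0 would be standardized and passed on to clustering.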

For the two-zone expression profile comparisons, five pattern types were assigned to the root-expressed genes. FC was calculated as EDZ FPKM/MZ FPKM (FC was set at 10 for genes with MZ FPKM = 0): type 1, FC ≥ 3; type 2, 1.5 < FC < 3; type 3, 0.67 ≤ FC ≤ 1.5; type 4, 0.33 < FC < 0.67; type 5, 0 ≤ FC ≤ 0.33.
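The two-zone thresholds above translate directly into a small classification function. The sketch below is in Python, with a hypothetical function name; the cutoffs and the FC value of 10 for genes with MZ FPKM = 0 are taken from the text.

```python
def two_zone_type(mz_fpkm, edz_fpkm):
    """Assign pattern type 1 to 5 from the EDZ/MZ fold change."""
    # FC is set at 10 when the MZ value is zero, as stated in the text.
    fc = 10.0 if mz_fpkm == 0 else edz_fpkm / mz_fpkm
    if fc >= 3:
        return 1  # type 1: FC >= 3
    if fc > 1.5:
        return 2  # type 2: 1.5 < FC < 3
    if fc >= 0.67:
        return 3  # type 3: 0.67 <= FC <= 1.5
    if fc > 0.33:
        return 4  # type 4: 0.33 < FC < 0.67
    return 5      # type 5: 0 <= FC <= 0.33
```

For example, a gene with MZ FPKM = 2 and EDZ FPKM = 3 has FC = 1.5 and falls in type 3, because the upper bound of type 3 is inclusive.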

Fisher’s Exact Test Analyses

To test for association between genes in different species in a given gene family by expression pattern type, a Fisher's exact test was performed. Given two species, A and B, and two expression profile types, ia and ib (where i ranges from 1 to 9 for angiosperm comparisons or from 1 to 5 for S. moellendorffii comparisons), the number of GreenPhyl gene families that possess exactly one A gene and one B gene, one A gene and two B genes, or two A genes and one B gene was counted as the background total. For these three circumstances, overlapping gene families were those containing at least one gene from A with pattern ia and at least one gene from B with pattern ib. Nonoverlapping gene families were those containing at least one gene from A with pattern type ia, but no gene from B with pattern type ib and vice versa. When comparing types from the same species, only families possessing exactly two genes were considered. Overlapping gene families were those containing each of the types of interest. Nonoverlapping gene families were those containing only one type of interest. The Fisher’s test was performed in R. The resulting P values were corrected for multiple testing by the Benjamini-Hochberg method (Benjamini and Hochberg, 1995).
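To make the family-counting scheme concrete, the following Python sketch builds a 2 × 2 table from the overlapping, nonoverlapping, and background family counts and computes a Fisher's exact test P value. The table layout, the function names, and the choice of a two-sided test (the default of fisher.test in R) are assumptions; the text does not state which alternative was used.

```python
from math import comb


def family_table(n_overlap, n_a_only, n_b_only, n_background):
    """Arrange family counts as (a, b, c, d) for the table [[a, b], [c, d]].

    a: families with an A gene of type ia and a B gene of type ib
    b: families with type ia in A but no type ib in B
    c: families with type ib in B but no type ia in A
    d: remaining background families
    """
    n_neither = n_background - n_overlap - n_a_only - n_b_only
    return n_overlap, n_a_only, n_b_only, n_neither


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
```

The P values from all pairwise comparisons would then be adjusted together for multiple testing.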

Generation and Analyses of Supergenes

Supergene expression was obtained by summing the transcript expression values from all genes from the same species within a given family. Expression profile types were then assigned by clustering using the Mfuzz program (Kumar and Futschik, 2007) as above or by FC (described below). Comparisons within species for the same expression profile types were assigned a P value of 0, whereas comparisons within species for different expression profile types were assigned a P value of 1 because each species was only able to have one expression profile type within a given family in this analysis. Overlapping gene families were counted as ones possessing supergenes from two species with the same profile types. Nonoverlapping gene families were counted as the ones possessing supergenes with different profile types in the two species being considered. The total number of gene families possessing supergenes from two species was used as background.
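The supergene construction amounts to a grouped sum. A minimal Python sketch is given here; the function name, the identifiers, and the representation of a profile as a list of per-zone values are hypothetical.

```python
def supergene_profiles(gene_profiles, gene_to_family_species):
    """Sum zone profiles over all genes sharing a (family, species) pair.

    gene_profiles: gene ID -> list of expression values (one per zone)
    gene_to_family_species: gene ID -> (family ID, species ID)
    """
    totals = {}
    for gene, profile in gene_profiles.items():
        key = gene_to_family_species[gene]
        if key not in totals:
            totals[key] = list(profile)
        else:
            totals[key] = [t + x for t, x in zip(totals[key], profile)]
    return totals
```

Each (family, species) pair therefore ends up with exactly one profile, which is why a species can hold only one expression profile type per family in this analysis.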

For supergene expression comparisons across all seven species, the 5027 GreenPhyl families that possessed at least one root-expressed supergene from each of the seven species were analyzed. The EDZ/MZ gene expression FC was calculated for each species’ supergene within these 5027 families, and these were used to construct a matrix containing the pairwise Pearson’s correlation coefficients (Figure 3C). To create angiosperm expression profiles, the FC values from the dicot supergenes and from the monocot supergenes were averaged separately. Next, the mean of the dicot and monocot FC values was used to represent the overall angiosperm expression profile and compared with the corresponding S. moellendorffii supergene expression profile from the same family.

Connectivity

CNV was calculated as the average pairwise Pearson's correlation coefficient of a given gene’s expression profile to all the genes’ profiles within a given family, similar to the reported methods (Koenig et al., 2013).
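A connectivity value of this kind can be sketched in Python as follows. The sketch assumes that a gene is not compared with itself and that no profile is constant across zones (a constant profile has zero variance, so its correlation is undefined); both points are assumptions of this example, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def connectivity(gene, family_profiles):
    """Average correlation of one gene's profile with the other family members.

    family_profiles: gene ID -> expression profile, for one gene family
    with at least two members.
    """
    target = family_profiles[gene]
    others = [p for g, p in family_profiles.items() if g != gene]
    return sum(pearson(target, p) for p in others) / len(others)
```

Values near 1 indicate that a gene's expression profile closely tracks those of the other members of its family.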

Hierarchical Clustering

For each assigned gene expression pattern type, the overlapping ratio of expression pattern types between two species was calculated as the number of families that possessed at least one gene from each species with a given expression pattern type divided by the number of families that possessed at least one gene from either species with a given expression pattern type. The dissimilarity was measured as 1 − overlapping ratio. The average dissimilarity across nine expression profile types was used for hierarchical clustering by the hclust function in R with default “complete” method.
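The overlapping ratio defined above is the intersection of two family sets divided by their union. The Python sketch below computes the ratio and the averaged dissimilarity for one pair of species; the function names and data layout are hypothetical, and returning a ratio of 0 when neither species has any family of a given type is an assumption. The clustering step itself (hclust in R) is not reproduced.

```python
def overlap_ratio(families_a, families_b):
    """Shared families divided by families found in either species.

    Each argument is the set of family IDs with at least one gene of a
    given expression pattern type in that species.
    """
    union = families_a | families_b
    if not union:
        return 0.0  # assumption: no families of this type in either species
    return len(families_a & families_b) / len(union)


def mean_dissimilarity(types_a, types_b):
    """Average of (1 - overlap ratio) across expression pattern types.

    types_a, types_b: pattern type -> set of family IDs for each species.
    """
    pattern_types = sorted(set(types_a) | set(types_b))
    total = sum(
        1.0 - overlap_ratio(types_a.get(t, set()), types_b.get(t, set()))
        for t in pattern_types
    )
    return total / len(pattern_types)
```

Computing this value for every pair of species yields the dissimilarity matrix used as input for hierarchical clustering.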

GO Term Enrichment Test

GO term enrichment analysis was performed by DAVID (http://david.abcc.ncifcrf.gov/) with P values corrected by the Benjamini-Hochberg FDR method (Benjamini and Hochberg, 1995). The analysis was performed on the Arabidopsis genes present in the gene families with CNVs > 5.9 (10.5% of total) that had been clustered into nine groups by the Mfuzz program (Kumar and Futschik, 2007). The “gene family expression profile” in the three developmental zones was generated by summing the standardized expression values of each supergene within a given family. Significantly enriched terms with Benjamini-Hochberg (Benjamini and Hochberg, 1995) corrected P value ≤ 0.01 were found for five of these nine groups (3, 4, 5, 7, and 8).

Phylogenetic Analysis

BLAST (v2.2.26+; Camacho et al., 2009) was used to search for candidate homologous genes (e-value ≤ 1) in a given database composed of protein sequences from five or six species. Then, pairwise comparisons between all candidate genes were performed by a fast Smith-Waterman search (SWIPE version 2.0.7; Rognes, 2011) to cluster sequences into homolog groups by a connected component clustering method essentially as previously described (Bernardes et al., 2015) (Supplemental Methods 1). The assumption in the clustering approach was that orthologous sequences should be more closely related to each other than to nonorthologous sequences. After clustering, the single longest protein sequence for each gene model was retained. Evolution models were tested by Modelgenerator (version 0.85; Keane et al., 2006). All related sequences were first aligned by MAFFT (version 6.864b; Katoh and Standley, 2013; with settings --genafpair --ep 0 --maxiterate 1000), and the maximum likelihood tree was computed by RAxML (version 7.7.8; Stamatakis, 2006) with the JTT model using empirical base frequencies, gamma distribution of rate heterogeneity, and 100 rapid bootstrap replicates (-m PROTGAMMAJTTF -f a -N 100) or, if the tree size exceeded 100, the first analysis was conducted with FastTree (Price et al., 2009; default settings with the -gamma option on). Next, sequences from a well-supported clade (defined as a monophyletic group including genes from all species, unless the closely related genes were included in another well-supported clade) with bootstrap ≥70 or FastTree local support ≥0.85, together with sequence from an outgroup (neighboring) clade or gene (a closely related Arabidopsis gene with the smallest BLAST e-value), were realigned by MAFFT with the same settings. Alignments were trimmed using trimAl (version 1.2rev59; Capella-Gutiérrez et al., 2009) with setting -automated1 or -gappyout (for WOX5 family tree only). Alignments are provided in Supplemental Data Sets 11 and 12, using the SeaView (v.4.3.3) platform (Gouy et al., 2010). The final trees were reconstructed by RAxML with 1000 rapid bootstrap replicates (-m PROTGAMMAJTTF -f a -N 1000). Most trees were rooted between two well-supported clades. If no related well-supported neighbor clade could be identified, the tree was rooted by an Arabidopsis outgroup gene. If neither was available, the tree was rooted between monocots and eudicots (for the angiosperm species trees) or between lycophytes and euphyllophytes (for the vascular plant trees). Norway spruce protein sequences were downloaded from http://congenie.org/start (Nystedt et al., 2013).

Accession Numbers

Sequence data from this article can be found in the Gene Expression Omnibus under accession number GSE64665 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE64665).

Supplemental Data

Acknowledgments

We thank Richard McEachin and Ana Grant (University of Michigan) for RNA-Seq pipeline development and Ya Yang and Stephen Smith (University of Michigan) for assistance with phylogenetic analysis and pipeline development.

AUTHOR CONTRIBUTIONS

L.H. and J.S. designed the research, performed the experiments, analyzed the data, and wrote the article.

Glossary

MZ: meristematic zone
EZ: elongation zone
DZ: differentiation zone
FPKM: fragments per kilobase per million mapped reads
FDR: false discovery rate
PCA: principal component analysis
FC: fold change
CNV: connectivity value
GO: Gene Ontology
EDZ: combined EZ + DZ

References

  1. Banks J.A. (2009). Selaginella and 400 million years of separation. Annu. Rev. Plant Biol. 60: 223–238.
  2. Banks J.A., et al. (2011). The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science 332: 960–963.
  3. Barlow P.W. (2002). The root cap: Cell dynamics, cell differentiation and cap function. J. Plant Growth Regul. 21: 261–286.
  4. Benfey P.N. (1999). Is the shoot a root with a view? Curr. Opin. Plant Biol. 2: 39–43.
  5. Benjamini Y., Hochberg Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57: 289–300.
  6. Bennett T., Scheres B. (2010). Root development-two meristems for the price of one? Curr. Top. Dev. Biol. 91: 67–102.
  7. Bennett T., van den Toorn A., Sanchez-Perez G.F., Campilho A., Willemsen V., Snel B., Scheres B. (2010). SOMBRERO, BEARSKIN1, and BEARSKIN2 regulate root cap maturation in Arabidopsis. Plant Cell 22: 640–654.
  8. Bernardes J.S., Vieira F.R., Costa L.M., Zaverucha G. (2015). Evaluation and improvements of clustering algorithms for detecting remote homologous protein families. BMC Bioinformatics 16: 34.
  9. Birnbaum K., Jung J.W., Wang J.Y., Lambert G.M., Hirst J.A., Galbraith D.W., Benfey P.N. (2005). Cell type-specific expression profiling in plants via cell sorting of protoplasts from fluorescent reporter lines. Nat. Methods 2: 615–619.
  10. Bolger A.M., Lohse M., Usadel B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120.
  11. Brady S.M., Orlando D.A., Lee J.Y., Wang J.Y., Koch J., Dinneny J.R., Mace D., Ohler U., Benfey P.N. (2007). A high-resolution root spatiotemporal map reveals dominant expression patterns. Science 318: 801–806.
  12. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. (2009). BLAST+: architecture and applications. BMC Bioinformatics 10: 421.
  13. Capella-Gutiérrez S., Silla-Martínez J.M., Gabaldón T. (2009). trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972–1973.
  14. Doyle J. (2013). Phylogenetic analyses and morphological innovations in land plants. In Annual Plant Reviews: The Evolution of Plant Form, Vol. 45, B.A. Ambrose and M.D. Purugganan, eds (New York: Wiley-Blackwell), pp. 1–50.
  15. Esau K. (1965). Plant Anatomy. (New York: John Wiley & Sons).
  16. Feller A., Machemer K., Braun E.L., Grotewold E. (2011). Evolutionary and comparative analysis of MYB and bHLH plant transcription factors. Plant J. 66: 94–116.
  17. Friedman W.E., Moore R.C., Purugganan M.D. (2004). The evolution of plant development. Am. J. Bot. 91: 1726–1741.
  18. Galbraith D.W., Birnbaum K. (2006). Global studies of cell type-specific gene expression in plants. Annu. Rev. Plant Biol. 57: 451–475.
  19. Gensel P.G. (2008). The Earliest Land Plants. Annu. Rev. Ecol. Evol. Syst. 39: 459–477.
  20. Gensel P.G., Berry C.M. (2001). Early lycophyte evolution. Am. Fern J. 91: 74–98.
  21. Gouy M., Guindon S., Gascuel O. (2010). SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol. Biol. Evol. 27: 221–224.
  22. Hochholdinger F., Tuberosa R. (2009). Genetic and genomic dissection of maize root development and architecture. Curr. Opin. Plant Biol. 12: 172–177.
  23. Jansen L., Hollunder J., Roberts I., Forestan C., Fonteyne P., Van Quickenborne C., Zhen R.G., McKersie B., Parizot B., Beeckman T. (2013). Comparative transcriptomics as a tool for the identification of root branching genes in maize. Plant Biotechnol. J. 11: 1092–1102.
  24. Karve R., Iyer-Pascuzzi A.S. (2015). Digging deeper: high-resolution genome-scale data yields new insights into root biology. Curr. Opin. Plant Biol. 24: 24–30.
  25. Katoh K., Standley D.M. (2013). MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30: 772–780.
  26. Keane T.M., Creevey C.J., Pentony M.M., Naughton T.J., McInerney J.O. (2006). Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol. Biol. 6: 29.
  27. Kenrick P., Crane P.R. (1997). The origin and early evolution of plants on land. Nature 389: 33–39.
  28. Kenrick P., Strullu-Derrien C. (2014). The origin and early evolution of roots. Plant Physiol. 166: 570–580.
  29. Kim D., Pertea G., Trapnell C., Pimentel H., Kelley R., Salzberg S.L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14: R36.
  30. Koenig D., et al. (2013). Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proc. Natl. Acad. Sci. USA 110: E2655–E2662.
  31. Kumar L., Futschik M.E. (2007). Mfuzz: a software package for soft clustering of microarray data. Bioinformation 2: 5–7.
  32. Langmead B., Salzberg S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9: 357–359.
  33. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R.; 1000 Genome Project Data Processing Subgroup (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics 25: 2078–2079.
  34. Ligrone R., Duckett J.G., Renzaglia K.S. (2012). Major transitions in the evolution of early land plants: a bryological perspective. Ann. Bot. (Lond.) 109: 851–871.
  35. Mantegazza O., Gregis V., Chiara M., Selva C., Leo G., Horner D.S., Kater M.M. (2014). Gene coexpression patterns during early development of the native Arabidopsis reproductive meristem: novel candidate developmental regulators and patterns of functional redundancy. Plant J. 79: 861–877.
  36. Nakazawa M. (2014). Practices of Medical and Health Data Analysis Using R. (Boston: Pearson Education).
  37. Nystedt B., et al. (2013). The Norway spruce genome sequence and conifer genome evolution. Nature 497: 579–584.
  38. Petricka J.J., Winter C.M., Benfey P.N. (2012). Control of Arabidopsis root development. Annu. Rev. Plant Biol. 63: 563–590.
  39. Price M.N., Dehal P.S., Arkin A.P. (2009). FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol. Biol. Evol. 26: 1641–1650.
  40. Qiao Z., Libault M. (2013). Unleashing the potential of the root hair cell as a single plant cell type model in root systems biology. Front. Plant Sci. 4: 484.
  41. Raven J.A., Edwards D. (2001). Roots: evolutionary origins and biogeochemical significance. J. Exp. Bot. 52: 381–401.
  42. Robinson M.D., McCarthy D.J., Smyth G.K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140.
  43. Rognes T. (2011). Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation. BMC Bioinformatics 12: 221.
  44. Rost T.L. (2011). The organization of roots of dicotyledonous plants and the positions of control points. Ann. Bot. (Lond.) 107: 1213–1222.
  45. Rouard M., Guignon V., Aluome C., Laporte M.A., Droc G., Walde C., Zmasek C.M., Périn C., Conte M.G. (2011). GreenPhylDB v2.0: comparative and functional genomics in plants. Nucleic Acids Res. 39: D1095–D1102.
  46. Schiefelbein J.W., Somerville C. (1990). Genetic control of root hair development in Arabidopsis thaliana. Plant Cell 2: 235–243.
  47. Seago J.L. Jr., Fernando D.D. (2013). Anatomical aspects of angiosperm root evolution. Ann. Bot. (Lond.) 112: 223–238.
  48. Stahl Y., Simon R. (2010). Plant primary meristems: shared functions and regulatory mechanisms. Curr. Opin. Plant Biol. 13: 53–58.
  49. Stamatakis A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690.
  50. Tomescu A.M.F., Wyatt S.E., Hasebe M., Rothwell G.W. (2014). Early evolution of the vascular plant body plan - the missing mechanisms. Curr. Opin. Plant Biol. 17: 126–136.
  51. Trapnell C., Hendrickson D.G., Sauvageau M., Goff L., Rinn J.L., Pachter L. (2013). Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 31: 46–53.
  52. Wickham H. (2009). ggplot2: Elegant Graphics for Data Analysis. (New York: Springer).
  53. Willemsen V., Bauch M., Bennett T., Campilho A., Wolkenfelt H., Xu J., Haseloff J., Scheres B. (2008). The NAC domain transcription factors FEZ and SOMBRERO control the orientation of cell division plane in Arabidopsis root stem cells. Dev. Cell 15: 913–922.
  54. Zhang Z.H., Jhaveri D.J., Marshall V.M., Bauer D.C., Edson J., Narayanan R.K., Robinson G.J., Lundberg A.E., Bartlett P.F., Wray N.R., Zhao Q.Y. (2014). A comparative study of techniques for differential expression analysis on RNA-Seq data. PLoS One 9: e103207.

Articles from The Plant Cell are provided here courtesy of Oxford University Press