Plant Direct. 2019 Jul 26;3(7):e00147. doi: 10.1002/pld3.147

Accelerating structure‐function mapping using the ViVa webtool to mine natural variation

Morgan O Hamm 1, Britney L Moss 2, Alexander R Leydon 1, Hardik P Gala 1, Amy Lanctot 1, Román Ramos 1, Hannah Klaeser 2, Andrew C Lemmex 1, Mollye L Zahler 1, Jennifer L Nemhauser 1, R Clay Wright 3
PMCID: PMC6658840  PMID: 31372596

Abstract

Thousands of sequenced genomes are now publicly available capturing a significant amount of natural variation within plant species; yet, much of these data remain inaccessible to researchers without significant bioinformatics experience. Here, we present a webtool called ViVa (Visualizing Variation) which aims to empower any researcher to take advantage of the amazing genetic resource collected in the Arabidopsis thaliana 1001 Genomes Project (http://1001genomes.org). ViVa facilitates data mining on the gene, gene family, or gene network level. To test the utility and accessibility of ViVa, we assembled a team with a range of expertise within biology and bioinformatics to analyze the natural variation within the well‐studied nuclear auxin signaling pathway. Our analysis has provided further confirmation of existing knowledge and has also helped generate new hypotheses regarding this well‐studied pathway. These results highlight how natural variation could be used to generate and test hypotheses about less‐studied gene families and networks, especially when paired with biochemical and genetic characterization. ViVa is also readily extensible to databases of interspecific genetic variation in plants as well as other organisms, such as the 3,000 Rice Genomes Project ( http://snp-seek.irri.org/) and human genetic variation ( https://www.ncbi.nlm.nih.gov/clinvar/).

Keywords: accessibility, Arabidopsis thaliana, genome diversity, genotype‐phenotype, natural variation, structure‐function

1. INTRODUCTION

The sequencing of the first Arabidopsis thaliana genome ushered in a new era of tool development and systematic functional annotation of plant genes (The Arabidopsis Genome Initiative 2000). Since that landmark effort, massive scaling of sequencing technology has allowed for the survey of genomic variation in natural A. thaliana populations (Borevitz et al., 2007; Nordborg et al., 2005; Weigel & Mott, 2009). This valuable population genetics resource has led to several associations of genetic loci with phenotypic traits and provided insights into how selective pressure has influenced the evolution of plant genomes (Atwell et al., 2010; Clark et al., 2007; Long et al., 2013).

Beyond its utility in gene discovery and understanding genome evolution, natural genetic variation provides a catalog of permissible polymorphisms that can facilitate the connection of genotype to phenotype at the gene, gene family, and network scales (Joly‐Lopez, Flowers, & Purugganan, 2016; Nieduszynski & Liti, 2011). This is an especially critical resource for studying large gene families where loss of function in individual genes may have little or no phenotypic effect (Dharmasiri et al., 2005; Guo, 2013; Moore & Purugganan, 2005) and directed allele replacement remains time and resource‐intensive (Chen, Wang, Zhang, Zhang, & Gao, 2019). Natural variation datasets provide novel alleles and germplasm which can be examined with biochemical and genetic approaches to map sequence to function and genotype to phenotype. In human clinical medicine, massively parallel assays of variant effects stand to revolutionize genetic diagnostics and personalized medicine (Gasperini, Starita, & Shendure, 2016; Matreyek, Stephany, & Fowler, 2017; Starita et al., 2017). Similarly, we envision the use of plant natural variation datasets as a tool to revolutionize the breeding and genetic engineering of crop plants by rapidly advancing our understanding of genotype/function/phenotype relationships. A proof‐of‐principle survey of a relatively small subset of natural variants paired with a synthetic assay of gene function successfully mapped critical functional domains of auxin receptors and identified new alleles which affect plant phenotype (Wright, Zahler, Gerben, & Nemhauser, 2017).

Why is the survey of natural variants not as routine as a BLAST search or ordering T‐DNA insertion mutants? One reason may be the current requirement for a fairly high level of bioinformatics expertise to extract the desired information from whole‐genome resequencing datasets. While existing resources such as the 1001 Proteomes website (Joshi et al., 2012) and ePlant (Waese et al., 2017) facilitate access to these data at the gene scale, they do not provide comparative summaries or visualizations of variation at the gene family scale. To address this concern, we created ViVa: a webtool and R package for Visualizing Variation, which gives plant molecular biologists of any level access to gene‐level data from the 1001 Genomes database. Using ViVa, researchers may: (a) identify polymorphisms to facilitate biochemical assays of variant effects (Starita et al., 2017; Wright et al., 2017); (b) produce family‐wise alignments of variants to facilitate de novo functional domain identification (Melamed, Young, Miller, & Fields, 2015); (c) generate lists of accessions containing polymorphisms to facilitate phenotypic analysis of gene variant effects (Park et al., 2017); and (d) quantify metrics of genetic diversity to facilitate the study of gene, gene family, and network evolution (Delker et al., 2010; Kliebenstein, 2008). Here we present a summary of the functionality of ViVa and an analysis of the natural variation in the nuclear auxin signaling network using ViVa. To succinctly demonstrate the use of ViVa, we focus on the analysis of the Aux/IAA family; similar analyses were performed for the other nuclear auxin signaling gene families and are provided as Section 5 for the interested reader.

2. METHODS

2.1. Data sources

2.1.1. Variant data

Variant data were queried from the 1001 Genomes Project (http://1001genomes.org) via URL requests to its API service (http://tools.1001genomes.org/api/index.html). These queries returned subsets of the whole‐genome variant call format (VCF) file as SnpEff‐annotated VCF files. The whole‐genome VCF file can be found on the project's website at http://1001genomes.org/data/GMI-MPI/releases/v3.1/.

2.1.2. Germplasm accession information

A dataset describing each of the 1,135 accessions, including CS stock numbers and the geographic locations where the samples were collected, was retrieved from the 1001 Genomes website at http://1001genomes.org/accessions.html, via the download link at the bottom of the page. This data file has been embedded in the R package as accessions.

2.1.3. Gene and transcript accession information

Information on the genes and transcripts, including chromosomal coordinates, start and end location, and transcript length, was downloaded from Araport11 as a general feature format (GFF) file (Cheng et al., 2017). The Araport11 full genome GFF file, which can also be found on the TAIR website (https://www.arabidopsis.org/download/index-auto.jsp?dir=%2Fdownload_files%2FGenes%2FAraport11_genome_release), has been embedded in the R package as a GRanges object, gr. The TAIR10 database, found at http://arabidopsis.org, was accessed via the biomart function, using the biomaRt R package. Gene identifiers used in this study are in Table 1.

Table 1.

Full list of genes used in this study, by identifier, symbol, and classification

Arabidopsis AGI locus identifier Gene symbol Clade/class
AT3G62980 TIR1 NA
AT4G03190 AFB1 NA
AT3G26810 AFB2 NA
AT1G12820 AFB3 NA
AT4G24390 AFB4 NA
AT5G49980 AFB5 NA
AT2G39940 COI1 NA
AT4G14560 IAA1 A
AT3G23030 IAA2 A
AT1G04240 IAA3 A
AT5G43700 IAA4 A
AT1G15580 IAA5 A
AT1G52830 IAA6 A
AT3G23050 IAA7 A
AT2G22670 IAA8 A
AT5G65670 IAA9 A
AT1G04100 IAA10 B
AT4G28640 IAA11 B
AT1G04550 IAA12 B
AT2G33310 IAA13 B
AT4G14550 IAA14 A
AT1G80390 IAA15 A
AT3G04730 IAA16 A
AT1G04250 IAA17 A
AT1G51950 IAA18 B
AT3G15540 IAA19 A
AT2G46990 IAA20 C
AT3G16500 IAA26 B
AT4G29080 IAA27 A
AT5G25890 IAA28 B
AT4G32280 IAA29 C
AT3G62100 IAA30 C
AT3G17600 IAA31 C
AT2G01200 IAA32 C
AT5G57420 IAA33 C
AT1G15050 IAA34 C
AT1G15750 TPL NA
AT1G80490 TPR1 NA
AT3G16830 TPR2 NA
AT5G27030 TPR3 NA
AT3G15880 TPR4 NA
AT1G59750 ARF1 B
AT5G62000 ARF2 B
AT2G33860 ARF3 B
AT5G60450 ARF4 B
AT1G19850 ARF5 A
AT1G30330 ARF6 A
AT5G20730 ARF7 A
AT5G37020 ARF8 A
AT4G23980 ARF9 B
AT2G28350 ARF10 C
AT2G46530 ARF11 B
AT1G34310 ARF12 B
AT1G34170 ARF13 B
AT1G35540 ARF14 B
AT1G35520 ARF15 B
AT4G30080 ARF16 C
AT1G77850 ARF17 C
AT3G61830 ARF18 B
AT1G19220 ARF19 A
AT1G35240 ARF20 B
AT1G34410 ARF21 B
AT1G34390 ARF22 B
AT1G43950 ARF23 B

2.2. Ranking of variant functional effects

Alignments were colored according to the strongest‐effect variant allele occurring at any frequency at that position, as reported in the SnpEff “effect” field, per the scale in Figure 1.

Figure 1.

Rank order of the strength of functional effects. Variant effect classes were ordered by subjective prediction of average strength of effect on gene function. Strength was then assigned to each effect on an integer scale.
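The ranking step can be illustrated with a minimal Python sketch. The integer ranks below are hypothetical placeholders, since the actual scale is the one shown in Figure 1, and the function name is ours rather than part of the r1001genomes package. The sketch keeps, for each position, the effect with the highest rank regardless of allele frequency.

```python
# Hypothetical integer ranks for illustration only; the scale actually used
# is the one shown in Figure 1.
EFFECT_RANK = {
    "synonymous_variant": 1,
    "missense_variant": 2,
    "stop_gained": 3,
    "frameshift_variant": 4,
}


def strongest_effect_by_position(variants):
    """Return {position: effect}, keeping the highest-ranked effect per position.

    `variants` is an iterable of (position, effect) pairs. Allele frequency is
    ignored, so a rare strong-effect allele outranks a common weak one.
    """
    strongest = {}
    for position, effect in variants:
        rank = EFFECT_RANK.get(effect, 0)  # unknown effects rank lowest
        current = strongest.get(position)
        if current is None or rank > EFFECT_RANK.get(current, 0):
            strongest[position] = effect
    return strongest
```

Because frequency is ignored, a position is colored by a strong‐effect allele even if only a single accession carries it.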

2.3. Nucleotide diversity calculation

Nei and Li defined the nucleotide diversity statistic in their original paper as: “the average number of nucleotide differences per site between two randomly chosen DNA sequences” (Nei & Li, 1979), and provided the equation:

$$\pi = \sum_{ij} x_i x_j \pi_{ij} \tag{1}$$

where $x_i$ is the frequency of the ith sequence in the population and $\pi_{ij}$ is the number of sites that are different between the ith and jth sequence divided by sequence length.

A more general form that treats each sequence in the population as unique can be written as follows:

$$\pi = \frac{1}{L n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{L} \pi_{ijk}, \qquad \pi_{ijk} = \begin{cases} 1 & \text{if } N_{ik} \neq N_{jk} \\ 0 & \text{if } N_{ik} = N_{jk} \end{cases} \tag{2}$$

where $N_{ik}$ is the nucleotide (A, T, C, or G) at position k on the ith sequence of the population, L is the length of the sequence, and n is the total number of sequences in the population. Indels are excluded from the diversity calculation, leading to a single L for the population.

From this form, we can rearrange the summations to obtain:

$$\pi = \frac{1}{L} \sum_{k=1}^{L} \pi_k, \qquad \pi_k = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \pi_{ijk} \tag{3}$$

where $\pi_k$ can be thought of as the site‐wise nucleotide diversity at position k, and is equal to the nucleotide diversity of a sequence of length 1 at location k. We can calculate $\pi_k$ for each site, then average those over the sequence length to calculate $\pi$, the nucleotide diversity of the sequence.

The function Nucleotide_diversity in the r1001genomes package calculates $\pi_k$ for each position in the gene or region that contains a variant. Note that $\pi_k$ is equal to 0 at all locations without variants. This is also what is displayed in the Diversity Plot tab of the webtool.
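As a worked illustration of Equation 3, the following Python sketch computes the site‐wise diversity by comparing every ordered pair of sequences at a position, then averages over the sequence length. It is not the Nucleotide_diversity implementation from r1001genomes. The function names are ours, and the input is assumed to be a list of equal‐length, indel‐free sequences.

```python
def site_diversity_pairwise(column):
    """Site-wise diversity (Equation 3): fraction of ordered pairs (i, j) that differ."""
    n = len(column)
    differing = sum(1 for a in column for b in column if a != b)
    return differing / (n * n)


def nucleotide_diversity(sequences):
    """Nucleotide diversity: mean site-wise diversity over all L positions."""
    length = len(sequences[0])
    total = 0.0
    for k in range(length):
        column = [seq[k] for seq in sequences]  # nucleotides at position k
        total += site_diversity_pairwise(column)
    return total / length
```

For the four sequences ACGT, ACGA, ACGA, and TCGA, positions 1 and 4 each have 6 differing ordered pairs out of 16, giving a site‐wise diversity of 0.375 at each and an overall diversity of 0.1875.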

2.4. Detailed $\pi_k$ calculation simplification

The formula for π k above requires comparing every sequence to every other sequence at location k; however, we know there are only a few variant forms at each individual location.

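We can therefore return to Nei and Li's original Equation 1 and modify it slightly, replacing $x_i$ with $n_i/n$, where $n_i$ is the number of sequences in the population with nucleotide $N_i$ at location k:

$$\pi_k = \sum_{ij} \frac{n_i}{n} \frac{n_j}{n} \pi_{ij} = \frac{1}{n^2} \sum_{ij} n_i n_j \pi_{ij}, \qquad \pi_{ij}(k) = \begin{cases} 1 & \text{if } i \neq j \\ 0 & \text{if } i = j \end{cases} \tag{4}$$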

Note that in Equation 1, subscripts i and j are summed over all sequences in the population; however, in Equation 4 i and j are only summed over unique variants at a particular location k.

We will define $n_{!i} = n - n_i$ as the number of sequences different from i at position k. We can also see that the summed term will be zero if $i = j$, and $n_i n_j$ if $i \neq j$. Therefore:

$$\pi_k = \frac{1}{n^2} \sum_{i} n_i n_{!i} \tag{5}$$

Next, we substitute our definition of $n_{!i}$:

$$\pi_k = \frac{1}{n^2} \sum_{i} n_i (n - n_i) \tag{6}$$

Distributing and splitting the summation yields:

$$\pi_k = \frac{1}{n^2} \left( n \sum_{i} n_i - \sum_{i} n_i^2 \right) \tag{7}$$

Finally, because $\sum_i n_i = n$:

$$\pi_k = \frac{1}{n^2} \left( n^2 - \sum_{i} n_i^2 \right) \tag{8}$$

This simplified form for $\pi_k$ is used by the app, because the counts of unique variants at a single nucleotide location can easily be summarized in R.
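The simplified form in Equation 8 needs only the count of each nucleotide at a site. The Python sketch below illustrates this; it is not the package's R code, and the function name is ours.

```python
from collections import Counter


def site_diversity_counts(column):
    """Site-wise diversity from Equation 8: (n^2 - sum of n_i^2) / n^2."""
    n = len(column)
    counts = Counter(column)  # n_i for each nucleotide observed at this site
    return (n * n - sum(c * c for c in counts.values())) / (n * n)
```

For a site with nucleotides A, A, A, T, the counts are 3 and 1, so the result is (16 − 10)/16 = 0.375, the same value obtained by comparing all ordered pairs.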

2.5. Software

The r1001genomes package depends on many other R packages; the key bioinformatics packages are listed below.

biomaRt: used for accessing the TAIR10 database on arabidopsis.org.

vcfR: used to read in the VCF files in a flat “tidy” format for easy manipulation.

BSgenome: used as the source for the complete DNA string of the reference genome (Col‐0).

DECIPHER: used to align nucleotide and amino acid sequences of homologous genes.

GenomicFeatures: used for handling sequence annotations.

Biostrings: provides the underlying framework for the sequence manipulations used for generating and aligning sequences with BSgenome, DECIPHER, and GenomicFeatures.

Other packages that were critical to building ViVa and/or writing this document include the following: (Allaire, Ushey, & Tang, 2018; Allaire et al., 2018; Aphalo, 2018a,2018b; Arnold, 2018; Bache & Wickham, 2014; Garnier, 2018a,2018b; Hamm & Wright, 2018; Heibl, 2014; Henry & Wickham, 2018; Ihaka et al., 2019; Lang, 2018; Müller, 2018; Müller & Wickham, 2019; Müller, Wickham, James, & Falcon, 2018; Neuwirth, 2014; Pagès & Aboyoun, 2018; Pagès, Aboyoun, Gentleman, & DebRoy, 2019; Pagès, Aboyoun, & Lawrence, 2018; Pagès, Lawrence, & Aboyoun, 2018; Paradis et al., 2018; R Core Team 2018; Team 2018; Wagih, 2017; Wickham, 2016, 2017a,2017b, 2018a,2018b,2018c; Wickham, François, Henry, & Müller, 2018; Wickham & Henry, 2018; Wickham, Hester, & Francois, 2018; Wickham et al., 2018; Wright, 2019; Xie, 2018a,2018b; Xie, Cheng, & Tan, 2018; Yu, 2018; Yu & Lam, 2019).

2.6. Testing

A group consisting of undergraduate and graduate students and postdoctoral and postbaccalaureate researchers (the authors of this publication) was assembled to test the functionality of ViVa. Testers were asked to use the ViVa web interface to analyze the genetic variation at the clade or whole‐gene family level of the auxin nuclear signaling pathway. Testers were provided with a brief vignette on how to use ViVa to formulate new hypotheses or support existing hypotheses about gene function and evolution, which has been expanded into the “An Overview of ViVa” section. The testers had background knowledge of these genes as members of laboratories studying auxin nuclear signaling. Testers met weekly with the developers to discuss their user experience and their findings. Tester experiences, issues, and suggestions were incorporated into the ViVa software. The results collected by these analyses are summarized in the “Visualizing Variation within the auxin signaling pathway” section and Section 5.

3. RESULTS AND DISCUSSION

3.1. An overview of ViVa

ViVa, in this first iteration, is meant to visualize natural variation in the coding sequences of genes or gene families. Noncoding sequence variation is intentionally excluded from the analysis tools. This reflects the challenges both in alignment of noncoding sequences and the increased difficulty in assessing the effects of variation in these regions (Alexandre et al., 2018).

The stable version of ViVa is hosted at https://www.plantsynbiolab.bse.vt.edu/ViVa/. The development version of ViVa can be accessed as a Docker container at https://hub.docker.com/r/wrightrc/r1001genomes/ or as an R package at https://github.com/wrightrc/r1001genomes.

3.2. Gene select and annotation files

At the top of the ViVa webtool are two collapsible panels used for entering the genes to query and custom annotations for those genes (Figures 2a and 3a). The Gene Select panel permits gene input by either typing in or uploading a ".csv" file of AGI/TAIR locus identifiers. The Annotation Files panel is optionally used to upload an annotation file containing coordinates of domains, mutations, or any other sequence knowledge that can be plotted on some of the ViVa analysis tabs.

Figure 2.

Key elements of the webtool (a) The first section contains two collapsible panels, gene select and annotation files, which are used to input information about the genes to be investigated. (b) The SNP Stats tab provides gene‐structure level counts and statistics on SNPs. (c) The Diversity Plot tab plots the nucleotide diversity of SNP sites along the length of the coding region of a selected gene. (d) The SNP Mapping tab plots accessions on a world map colored according to the selected set of SNPs. (e) The SNP Browser tab allows variants and accessions to be filtered by any combination of text and numeric fields. (f) The Alignments tab aligns DNA and amino acid sequences of homologous genes and colors sequence elements based on SNPs and annotations

Figure 3.

ViVa workflow (a) Workflow diagram of ViVa. Blue indicates user actions, yellow indicates processing steps performed by the application. (b) Detailed look at mapping tab parameters: User selects which genes to look at (1) then clicks the Submit button (2). The “Allele selection” panel is then filled in with all non‐reference variants meeting the criteria. The user can adjust the range of nucleotide diversity and the type of SNP with a slider and radio buttons (3) to make the list of variants a manageable size. The list of variants is updated as changes are made to these controls. The user then selects variants to display on the map (4). Clicking the Update Map button (5) populates the map below with points located at the collection coordinates of each accession and colored by the selected alleles. (c) Detailed look at the browser tab options: The user first selects which genes to analyze and clicks submit (1). The “hide 0|0 genotype?” checkbox (2) removes rows from the table containing the reference allele. Four configurable filters of two types are provided. The first filter type is text matching (3); the user selects a column to filter from a drop‐down menu then enters one or more text strings to match in that column. The second type of filter is numeric range matching (4); the user again selects a column, then specifies a range to match by typing or selecting minimum and maximum values. After configuring the filters, the user clicks the Apply Filters button (5) to update the table. The table is updated such that only rows that meet all filter conditions are displayed

Below the data input section, the rest of the webtool is divided into several tabs containing interactive output.

3.3. SNP stats: Summary of gene information, structure, and diversity

The SNP Stats tab provides general information on the gene transcripts being queried, as well as calculated counts/statistics on the content of variants found in the sample population (Figure 2b). The first table of this section is the basic information about the transcripts, including TAIR locus and symbol, the chromosomal start and end position, and the transcript length. This information was collected from the Araport11 Official Release (06/2016) annotation dataset (Cheng et al., 2017).

The next two tables provide counts of SNPs across the gene body for each transcript. The Total Polymorphism Counts tab provides the total number of variant observations of nonreference alleles categorized by type and location (the Col‐0 accession TAIR9 genome is the reference genome for this dataset). The Unique Allele Count tab only counts the number of unique variant alleles within the population of accessions (e.g., if multiple accessions have the same variant, these are counted as a single allele at that position).
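The difference between the two count tables can be illustrated with a short Python sketch. The tuple layout (accession, position, alternate allele, effect) is a hypothetical simplification of the variant table, not the package's actual data structure.

```python
from collections import Counter


def polymorphism_counts(observations):
    """Return (total, unique) counts of nonreference alleles per effect type.

    `observations` holds one (accession, position, alt_allele, effect) tuple
    for every nonreference genotype call (hypothetical layout).
    """
    observations = list(observations)
    # Total counts: every accession carrying a variant adds one observation.
    total = Counter(effect for _, _, _, effect in observations)
    # Unique counts: the same allele at the same position is counted once.
    unique_alleles = {(pos, alt, effect) for _, pos, alt, effect in observations}
    unique = Counter(effect for _, _, effect in unique_alleles)
    return total, unique
```

If two accessions share the same missense allele at one position, the total count for missense variants is 2 while the unique allele count is 1.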

The Nucleotide Diversity Statistics tab provides nucleotide diversity statistic (π) values for the transcript and the coding sequence of each gene (Nei & Li, 1979). Given a set of nucleotide sequences from a population, π is the average number of nucleotide differences per site. Nucleotide diversity is also calculated for the set of only synonymous (πS) and only nonsynonymous (πN) polymorphisms. As nonsynonymous polymorphisms are more likely to give rise to functional change than synonymous polymorphisms, the ratio of nonsynonymous to synonymous polymorphisms provides a measure of the potential for functional diversity (Firnberg & Ostermeier, 2013; Whitehead et al., 2012). We present πN/πS, the ratio of nonsynonymous to synonymous diversity, as a correlate for functional diversity throughout ViVa (Hughes, Green, Garbayo, & Roberts, 2000; Nelson, Moncla, & Hughes, 2015). While imperfect, this metric may be suggestive of functional diversification when πN/πS ≫ 1 (Hughes, 1999).
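One simple way to obtain the two class‐specific statistics is to sum the site‐wise diversity separately over nonsynonymous and synonymous variant sites and divide each sum by the sequence length. The Python sketch below follows that reading. It is an assumption about the calculation rather than the package's implementation, and it does not normalize by the number of potential synonymous and nonsynonymous sites, as some estimators do.

```python
def pi_ratio(site_records, length):
    """Return (pi_N, pi_S, pi_N / pi_S) for a coding sequence of `length` bases.

    `site_records` is an iterable of (site_diversity, effect_class) pairs,
    where effect_class is "nonsynonymous" or "synonymous" (hypothetical labels).
    """
    records = list(site_records)
    pi_n = sum(pi for pi, cls in records if cls == "nonsynonymous") / length
    pi_s = sum(pi for pi, cls in records if cls == "synonymous") / length
    # The ratio is undefined when there is no synonymous diversity.
    ratio = pi_n / pi_s if pi_s > 0 else float("nan")
    return pi_n, pi_s, ratio
```

Genes with no synonymous diversity return an undefined ratio here; how such cases are displayed is left open.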

3.4. Diversity Plot: Visualize allelic diversity across the coding sequence

The Diversity Plot tab shows the nucleotide diversity of each variant in the coding region of a selected gene (Figures 2c and 4). Although the X‐axis is marked by codon number from the N‐terminus for interpretability, the diversity values are based on single‐nucleotide sites. The colors of markers on the plot identify the effect of the polymorphism. If annotation files are provided, the background of the plot is color‐coded by the annotated regions. If points on the plot are selected by clicking and dragging a box over them, the data for the selected points appear in the grey box below the plot. Below this is the complete data table containing all points on the plot, which can be downloaded as a ".csv" file. This tab allows users to identify regions of high diversity as well as isolate polymorphisms that may affect gene function and exist in multiple accessions, facilitating phenotypic analysis.

Figure 4.

IAA6 Diversity Plot. Nucleotide diversity of variant positions throughout the IAA6 coding sequence is plotted and colored according to the effect of the variant alleles at each position. The region of positive selection identified by Winkler et al. is highlighted.

3.5. SNP Mapping: View distributions of SNPs across the globe

The SNP Mapping tab plots the accessions’ collection locations on a world map and colors the points based on selected variant alleles (Figure 2d). After the user selects genes and filters based on the SNP type and level of nucleotide diversity, a group of checkboxes becomes available to select variant alleles to display on the map (Figures 3b and 5). The variant alleles are labeled with the Transcript_ID and Amino_Acid_Change fields, in the form [Transcript_ID|Amino_Acid_Change]. After selecting the variant alleles and updating the map, the accessions are plotted on the map colored by each unique combination of the selected alleles. Below the map is a table containing the accession details for all mapped accessions. This tab may help users formulate hypotheses about the relatedness of accessions sharing a common allele and environments in which that allele may be favorable.
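The grouping of accessions by allele combination can be sketched in Python as follows. The function names and the input structure are hypothetical, and the label simply joins the two fields with a vertical bar.

```python
def allele_label(transcript_id, amino_acid_change):
    """Join Transcript_ID and Amino_Acid_Change with a vertical bar."""
    return f"{transcript_id}|{amino_acid_change}"


def group_accessions(carriers, selected):
    """Map each accession to the combination of selected alleles it carries.

    `carriers` maps an allele label to the set of accessions carrying it.
    Accessions carrying none of the selected alleles are omitted.
    """
    groups = {}
    for label in sorted(selected):  # sorted so combinations are named consistently
        for accession in carriers.get(label, ()):
            groups.setdefault(accession, []).append(label)
    return {acc: " + ".join(labels) for acc, labels in groups.items()}
```

Each distinct value in the returned mapping would correspond to one color on the map.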

Figure 5.

Map of AFB1 oligomerization domain variant accessions. (a) Map showing the two accessions with variants in the AFB1 oligomerization domain. (b) The user‐selectable parameters of the mapping tab used to generate the map are provided as an example of using the SNP Mapping tab in the ViVa webtool. See Figure 3 for details on filling in the parameters.

3.6. SNP Browser: Filter and search for variants

The SNP Browser tab provides a way to search and filter the variant data by different fields (Figure 2e). After selecting the transcripts to include, a number of filters can be applied to the dataset to match text values (e.g., gene name or variant effect) or set minimum and maximum limits on the values of numeric fields (e.g., nucleotide diversity; Figure 3c). When these filters are applied, the table below is updated to only contain rows meeting the criteria for all filters. This tab can be useful for identifying all accessions with a particular allele, or any non‐reference alleles in a particular region of a gene that may not have been easily accessible in another tab.
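The two filter types, combined so that a row must pass all of them, can be sketched in Python as follows. The column names are hypothetical, and exact string matching with inclusive numeric bounds is an assumption; the webtool's matching rules may differ.

```python
def apply_filters(rows, text_filters=(), range_filters=()):
    """Keep only the rows that satisfy every filter.

    text_filters:  (column, [strings]) pairs; the cell must equal one string.
    range_filters: (column, minimum, maximum) triples with inclusive bounds.
    """
    kept = []
    for row in rows:
        text_ok = all(str(row[col]) in wanted for col, wanted in text_filters)
        range_ok = all(low <= row[col] <= high for col, low, high in range_filters)
        if text_ok and range_ok:
            kept.append(row)
    return kept
```

For example, filtering on an effect of "missense_variant" and a diversity between 0.1 and 1.0 would retain only missense variants found at appreciable frequency.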

3.7. Alignments: Visualize SNPs on alignments of homologous genes

The Alignments tab provides DNA and amino acid sequence alignments of selected genes, colored according to the variant allele with the strongest functional effect at each position (Figures 2f and 6, for a full description of the color scale see Figure 1). The content of this tab is most useful if the selected genes are all family members or have significant sequence homology. If annotation files are uploaded, open boxes will be overlaid on the alignment, colored by annotation. Hovering the cursor over variants will provide additional details about the alleles present at that locus. This tab facilitates family‐wise analysis of functional conservation, allowing users to identify potential functional regions and alleles which may be useful in deciphering this function.
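Placing variants on an alignment requires translating each residue number in the ungapped sequence into its alignment column. The following Python sketch shows one way to do this for a single aligned sequence. The function name and the "-" gap character are assumptions, and this is not the DECIPHER‐based code used by ViVa.

```python
def residue_to_column(aligned_seq, gap="-"):
    """Map 1-based residue numbers to 1-based alignment columns."""
    mapping = {}
    residue = 0
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol != gap:  # gaps occupy a column but are not residues
            residue += 1
            mapping[residue] = column
    return mapping
```

For the aligned sequence M--KA-L, residues 1 to 4 fall in alignment columns 1, 4, 5, and 7.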

Figure 6.

Critical functional domains of the conserved Aux/IAA genes show low nonsynonymous variation compared to regions of unknown functional importance. Protein sequences were aligned with DECIPHER (Wright, 2015) and variants were mapped to this alignment and colored according to the predicted functional effect of the allele of strongest effect at that position, with light colors having weaker effects on function and darker colors stronger effects. Red indicates missense variants. Color scale is further explained in Section 2. Alignment consensus is shown in grayscale underneath the plot as measured by evolutionary trace (Płuciennik et al., 2018; Wilkins, Erdin, Lua, & Lichtarge, 2012), with high consensus positions in black and low consensus in white. Key functional domains are outlined in black and labeled above the plot. The EAR domain spans codon alignment positions 70–74, corresponding to IAA1 amino acids LRLGL, 14–18. The degron domain spans alignment positions 194–201, corresponding to IAA1 amino acids QIVGWPPV, 55–62. The charged residues of the PB1 domain correspond to alignment positions 246, 256, 316, 318, 320, and 326 corresponding to IAA1 amino acids K77, R88, D133, D135, D137, and D143

3.8. Gene Tree: Visualize functional diversity and sequence divergence of a gene family

The Gene Tree tab provides a neighbor joining tree (or uploaded tree created by the user) for the selected genes with the tips of the tree mapped with predicted functional diversity as represented by π N/π S in the 1001 Genomes dataset (Figure 7). This tab allows users to generate hypotheses regarding functional diversity and redundancy within the context of the predicted evolution of the gene family.

Figure 7.

IAA protein sequence phylogenetic tree mapped with πN/πS reveals patterns of sister pair diversity/conservation. Protein sequences were aligned with DECIPHER (Wright, 2015) and low‐information content regions were masked with Aliscore (Kück et al., 2010) prior to inferring a phylogeny with MrBayes (Ronquist & Huelsenbeck, 2003). Tips of the tree are mapped with circles of color and diameter proportional to πN/πS. The πN/πS statistic provides a prediction of functional diversity. Nodes are labeled with the posterior probability of monophyly, a measure of confidence in the branch assignment, with one representing high confidence and zero, low confidence. There are two distinct clades of Aux/IAAs represented by the majority of the A and B classes. C class Aux/IAAs are missing one or more of the canonical Aux/IAA domains. These classes are represented by the text color of the gene name.

3.9. ViVa R package: Programmatic access to ViVa's functionality

All of the functionalities of the ViVa webtool are implemented through functions within the ViVa R package. In addition to being able to generate the same sets of figures and tables as in the webtool, users of the R package also gain direct access to the underlying data structures, providing greater control over parameters when processing and visualizing the data. The ViVa R package is intended for users familiar with R programming who want to extend the capabilities of the webtool. The ViVa R package can be found at https://github.com/wrightrc/r1001genomes and can be installed in your R environment via the remotes package: remotes::install_github("wrightrc/r1001genomes").

3.10. Visualizing Variation within the auxin signaling pathway

To test the usability and accessibility of ViVa, we assembled a group of alpha testers comprising postdoctoral, graduate, and undergraduate researchers at a research university (University of Washington) and at a primarily undergraduate institution (Whitman College). Our testers focused their investigation of natural variation on the nuclear auxin signaling pathway. We selected this signaling pathway for multiple reasons, including a wealth of functional data and solved structures of several domains or entire proteins. Using this existing knowledge, we were able to qualitatively assess the predictive ability of the ViVa modules. Below, we describe the analysis of the Aux/IAA family in detail as a brief vignette of ViVa use. Analysis of each gene family in the nuclear auxin signaling network can be found in Section 5.

The Aux/IAA family plays a critical role in transmitting auxin signals. Aux/IAA degradation is triggered by auxin accumulation (Zenser, Ellsmore, Leasure, & Callis, 2001) and is mediated by ubiquitination via a SKP‐Cullin‐F‐box ubiquitin ligase complex containing an Auxin‐signaling F‐box (AFB) auxin receptor protein (Gray, Kepinski, Rouse, Leyser, & Estelle, 2001). Variation in the AFB family is presented in Section 5 (Figures 5, 8, and 9). The Aux/IAAs repress transcription of Auxin Response Factor (ARF) bound genes (Tiwari, Hagen, & Guilfoyle, 2004) via recruitment of TOPLESS (TPL) and TOPLESS‐related (TPR) co‐repressors (Szemenyei, Hannon, & Long, 2008). Variation in the TPL/TPR family is presented in Section 5 (Figures 10 and 11). Aux/IAA degradation thus relieves this repression, allowing ARFs to activate transcription of auxin response genes (Tiwari, Hagen, & Guilfoyle, 2003). Variation in the ARF family is presented in Section 5 (Figures 12 and 13).

Figure 8.

Auxin‐signaling F‐box protein sequence tree mapped with πN/πS. Protein sequences were aligned with DECIPHER (Wright, 2015) and low‐information content regions were masked with Aliscore (Kück et al., 2010) prior to inferring a phylogeny with MrBayes (Ronquist & Huelsenbeck, 2003). Tips of the tree are mapped with circles of diameter proportional to πN/πS. Nodes are labeled with the posterior probability of monophyly.

Figure 9.

Alignment of the auxin‐signaling F‐box family. Protein sequences were aligned with DECIPHER (Wright, 2015) and variants were mapped to this alignment and colored according to the predicted functional effect of the allele of strongest effect at that position, with light colors having weaker effects on function and darker colors stronger effects. Red indicates missense variants. Color scale is explained in Section 2. In grayscale underneath the plot, alignment consensus is shown as measured by evolutionary trace (Płuciennik et al., 2018; Wilkins et al., 2012), with high‐consensus positions in black and low consensus in white.

Figure 10.

TPL protein sequence tree mapped with πN/πS. Protein sequences were aligned with DECIPHER (Wright, 2015) and low‐information content regions were masked with Aliscore (Kück et al., 2010) prior to inferring a phylogeny with MrBayes (Ronquist & Huelsenbeck, 2003). Tips of the tree are mapped with circles of diameter proportional to πN/πS and are also colored according to πN/πS. Nodes are labeled with the posterior probability of monophyly.

Figure 11.

Alignment of the TPL/TPR family. Protein sequences were aligned with DECIPHER (Wright, 2015) and variants were mapped to this alignment and colored according to the predicted functional effect of the allele of strongest effect at that position, with light colors having weaker effects on function and darker colors stronger effects. Red indicates missense variants. Color scale is explained in Section 2. In grayscale underneath the plot, alignment consensus is shown as measured by evolutionary trace (Płuciennik et al., 2018; Wilkins et al., 2012), with high‐consensus positions in black and low consensus in white

Figure 12.


Auxin response factor protein sequence tree mapped with π N/π S. Protein sequences were aligned with DECIPHER (Wright, 2015) and low‐information content regions were masked with Aliscore (Kück et al., 2010) prior to inferring a phylogeny with MrBayes (Ronquist & Huelsenbeck, 2003). Tips of the tree are mapped with circles of diameter proportional to π N/π S and are also colored according to π N/π S. Nodes are labeled with the posterior probability of monophyly

Figure 13.


Alignment of the full auxin response factor family. Protein sequences were aligned with DECIPHER (Wright, 2015) and variants were mapped to this alignment and colored according to the predicted functional effect of the allele of strongest effect at that position, with light colors having weaker effects on function and darker colors stronger effects. Red indicates missense variants. Color scale is explained in Section 2. In grayscale underneath the plot, alignment consensus is shown as measured by evolutionary trace (Płuciennik et al., 2018; Wilkins et al., 2012), with high‐consensus positions in black and low consensus in white

For all class A ARFs, the middle region of the protein was the predominant high‐diversity region (Figure 13). In the analyzed natural variation, ARF7 had several expansions of polyglutamine sequences in the middle region. Polyglutamine regions are known to readily expand and contract throughout evolutionary time due to replication error, and variation in polyglutamine length can have phenotypic consequences and be acted on by natural selection (Press, Carlson, & Queitsch, 2014). The ARF DNA‐binding domain had very few, low‐diversity missense mutations, as did the critical residues of the PB1 domain. Considering the necessity of their conserved functions, the low level of variation in these key DNA and protein‐protein interaction domains is expected.
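As an illustration of how polyglutamine expansions such as those noted in ARF7 could be compared between accessions, the sketch below locates runs of consecutive glutamine (Q) residues in a protein sequence. It is a minimal example and not part of ViVa; the function name `polyq_tracts`, the `min_len` cutoff, and the toy sequences are assumptions made for illustration.

```python
import re
from typing import List, Tuple


def polyq_tracts(protein: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (1-based start, length) for each run of at least min_len glutamines."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile("Q{%d,}" % min_len)
    return [(m.start() + 1, len(m.group())) for m in pattern.finditer(protein.upper())]


# Toy sequences (hypothetical, not real ARF7 alleles); the second has a longer tract.
reference = "MS" + "Q" * 5 + "APQQK"
expanded = "MS" + "Q" * 8 + "APQQK"

print(polyq_tracts(reference))
print(polyq_tracts(expanded))
```

Comparing the reported tract lengths between two alleles gives a quick readout of expansion or contraction at the same position.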

To predict the functional impact of variation in gene coding sequences, ViVa uses the frequency of nonsynonymous and synonymous polymorphisms. In most cases, nonsynonymous polymorphisms in critical functional domains are likely to have deleterious effects on gene function (Hughes et al., 2000). Domains that are critical to plant fitness will therefore accumulate fewer nonsynonymous polymorphisms than noncritical regions. Thus, we reasoned that scanning gene coding sequences for regions of relatively low nonsynonymous diversity should highlight functional domains. This general principle can be seen clearly in the analysis of the Aux/IAA family of transcriptional co‐repressors/co‐receptors. Most Aux/IAAs have three major domains. Domain I contains an EAR motif that facilitates interaction with TPL/TPR transcriptional co‐repressors (Szemenyei et al., 2008; Tiwari et al., 2004). Domain II, the degron, facilitates interactions with the TIR1/AFB receptors in the presence of auxin (Tan et al., 2007). Domain III (which was originally considered domains III and IV) is a PB1 domain and facilitates interactions with the ARF transcription factors (Guilfoyle & Hagen, 2012; Korasick et al., 2014; Nanao et al., 2014; Ulmasov, Murfett, Hagen, & Guilfoyle, 1997).
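The scanning principle described above can be sketched as a sliding‐window average over per‐position nonsynonymous diversity. This is a minimal illustration under stated assumptions rather than ViVa's implementation: the input list of per‐position diversity values, the window size, and the threshold are all hypothetical.

```python
from typing import List, Sequence


def windowed_diversity(pi_by_position: Sequence[float], window: int) -> List[float]:
    """Mean nonsynonymous diversity in each sliding window along the sequence."""
    if window < 1 or window > len(pi_by_position):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(pi_by_position[i:i + window]) / window
        for i in range(len(pi_by_position) - window + 1)
    ]


def low_diversity_windows(
    pi_by_position: Sequence[float], window: int, threshold: float
) -> List[int]:
    """0-based start indices of windows whose mean diversity is <= threshold."""
    means = windowed_diversity(pi_by_position, window)
    return [i for i, mean in enumerate(means) if mean <= threshold]


# Toy per-position diversity values (hypothetical): positions 2 to 4 carry no
# nonsynonymous variation and stand in for a conserved motif.
toy_pi = [0.5, 0.5, 0.0, 0.0, 0.0, 0.5]
print(low_diversity_windows(toy_pi, window=3, threshold=0.0))
```

Windows flagged in this way are candidates for functional domains, to be checked against known annotations such as the EAR motif or the degron.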

We focused our analysis on the most conserved A class Aux/IAAs, which possess all three of these domains. We began by mapping the natural genetic variation onto an alignment of the coding sequences via the ViVa Alignment tab. A similar visualization of the full gene family is presented in Section 5 (Figure 14). In the alignment of A class Aux/IAAs, the EAR motif and degron can be readily identified by the drop in nonsynonymous variation, as visualized by the lack of strong variant functional effects (Figure 6). The PB1 domain is not as readily identified, perhaps because its multiple key residues are spread out in linear sequence space. Notably, the key residues that facilitate electrostatic PB1‐PB1 interactions show little variation.

Figure 14.


Alignment of the complete Aux/IAA family. Protein sequences were aligned with DECIPHER (Wright, 2015) and variants were mapped to this alignment and colored according to the predicted functional effect of the allele of strongest effect at that position, with light colors having weaker effects on function and darker colors stronger effects. Red indicates missense variants. Color scale is further explained in Section 2. Alignment consensus is shown in grayscale underneath the plot as measured by evolutionary trace (Płuciennik et al., 2018; Wilkins et al., 2012), with high‐consensus positions in black and low consensus in white

Natural variation also provides a means to study how gene families are evolving. To visualize this, we used ViVa to map the diversity at nonsynonymous variant sites relative to synonymous variant sites (π N/π S) onto the Aux/IAA phylogenetic tree (Figure 7). This visualization enables straightforward comparison of rates of recent functional divergence as predicted by diversity within natural variation in the context of the sequence divergence throughout the history of a gene family. Clades and individual genes exhibiting low‐nonsynonymous diversity are likely functionally conserved. Conversely, genes with high‐nonsynonymous diversity are more likely to be under relaxed functional selection, indicating the possibility of functional drift, emergence of novel function, or pseudogenization.
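For readers who want to see the arithmetic behind such a ratio, the following minimal sketch computes π N/π S for one gene from variant counts. It assumes biallelic sites, the standard unbiased per‐site estimator π = n/(n − 1) · 2p(1 − p), and normalization by the numbers of nonsynonymous and synonymous sites; the function names and the toy counts are hypothetical, and ViVa's own calculation may differ in detail.

```python
from typing import List, Tuple

# Each variant is (effect, alt_count, n_samples); effect is "N" (nonsynonymous)
# or "S" (synonymous). This record layout is an assumption for illustration.
Variant = Tuple[str, int, int]


def site_pi(alt_count: int, n_samples: int) -> float:
    """Unbiased nucleotide diversity at one biallelic site."""
    if n_samples < 2:
        raise ValueError("need at least two sampled sequences")
    p = alt_count / n_samples
    return (n_samples / (n_samples - 1)) * 2.0 * p * (1.0 - p)


def pi_n_over_pi_s(variants: List[Variant], n_sites: float, s_sites: float) -> float:
    """Ratio of per-site nonsynonymous to per-site synonymous diversity."""
    pi_n = sum(site_pi(a, n) for effect, a, n in variants if effect == "N") / n_sites
    pi_s = sum(site_pi(a, n) for effect, a, n in variants if effect == "S") / s_sites
    if pi_s == 0.0:
        raise ZeroDivisionError("no synonymous diversity; ratio is undefined")
    return pi_n / pi_s


# Toy gene (hypothetical counts): one variant of each kind at equal frequency,
# with three times as many nonsynonymous as synonymous sites.
toy_variants = [("N", 5, 10), ("S", 5, 10)]
print(pi_n_over_pi_s(toy_variants, n_sites=300, s_sites=100))
```

Under these assumptions, a ratio below 1 indicates fewer nonsynonymous differences per site than synonymous ones, which is the pattern interpreted here as functional conservation.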

Previous research has found evidence of both broad genetic redundancy among the Aux/IAAs and also specificity within closely related pairs or groups of Aux/IAA proteins (Overvoorde et al., 2005; Winkler et al., 2017). For example, the iaa8‐1 iaa9‐1 double mutant and the iaa5‐1 iaa6‐1 iaa19‐1 triple mutant have wild‐type phenotypes (Overvoorde et al., 2005), yet the IAA6/IAA19 sister pair has significant differences in expression patterns, protein abundances, and functions suggesting they have undergone functional specialization since their divergence (Winkler et al., 2017). A closer examination of the IAA19 and IAA6 pair within Brassicaceae found evidence for positive selection and subfunctionalization of IAA6 relative to IAA19 (Winkler et al., 2017). Consistent with these results, ViVa revealed higher conservation (i.e., lower ratio of nonsynonymous to synonymous diversity) for IAA19 (π N/π S = 0.55) compared to IAA6 (π N/π S = 2.3; Figure 7), and also detected high nonsynonymous diversity within the same regions of IAA6 as seen by Winkler et al. (Figure 4). This pattern—one sister showing high nonsynonymous diversity while the other sister was more conserved—was observed frequently across the Aux/IAA as well as the AFB and ARF families (Figures 8 and 12), suggesting this could be a recurring feature in the evolution of these families supporting the large diversity in auxin functions.

The Aux/IAA phylogeny clusters into two distinct clades represented by the A and B classes (Remington, Vision, Guilfoyle, & Reed, 2004). The C class Aux/IAAs are missing one or more of the canonical Aux/IAA domains. We found notable exceptions to the pattern of diversification and conservation between sister pairs within the Class B Aux/IAA genes. The IAA10/IAA11, IAA18/IAA26, and IAA20/IAA30 pairs showed similar levels of nonsynonymous diversity. For example, IAA10 and IAA11 both showed functional conservation (π N/π S of 0.80 and 0.67, respectively). The A. thaliana ePlant browser indicates that IAA10 and IAA11 have almost identical expression patterns (Waese et al., 2017). Together, this evidence suggests a strong dosage requirement for these genes or that they have taken on novel functions since their emergence, with both genes contributing similarly to plant fitness. In support of novel function, expression and mutant analysis during embryogenesis suggest that IAA10 is required for suspensor‐hypophysis transition, while IAA11 is involved in later cell fate transitions (Rademacher et al., 2012).

4. CONCLUSION

ViVa has allowed our team of testers from various skill levels and backgrounds to meaningfully access and mine the 1001 genomes dataset. The visualizations of natural variation further supported much of the existing structure‐function knowledge of the well‐studied nuclear auxin signaling pathway and facilitated the generation of new hypotheses. Gene and gene family analyses can be combined, as was done here for nuclear auxin signaling, to understand variation within gene networks. Visualizations of variation within whole‐gene networks are planned for future iterations of ViVa. Application of ViVa to less‐studied genes and gene families promises to yield more novel hypotheses, which can be evaluated via genetic and functional assays to glean novel structure/function knowledge from this rich dataset.

ViVa results are intended to inform and inspire hypothesis generation, not be taken as absolute evidence of trends in gene or gene family evolution. Among the cautions worth noting in interpreting results are limitations of short‐read sequencing that lead to regions of missing data where low‐read quality may have prevented variant calls. We have assumed these missing variants are reference alleles, leading to undercounting in ViVa's diversity estimations. Visualizations of this uncertainty will be added to a future version of ViVa. Recent advances in sequencing technologies have been combined to generate extremely high‐quality genomes (Michael et al., 2018), and will reduce this source of uncertainty in future resequencing datasets. Another limitation is that the geographic coverage of accessions in the 1001 Genomes dataset is far from uniform, and thus diversity scores may not accurately reflect the allelic distributions of the global A. thaliana population.
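The effect of the reference‐allele assumption can be seen in a small worked example. The sketch below compares per‐site diversity when accessions with missing calls are counted as reference with diversity estimated from called accessions only. It assumes a biallelic site whose alternate allele is the minor allele and uses the standard estimator π = n/(n − 1) · 2p(1 − p); the function names and counts are hypothetical and not taken from ViVa.

```python
def site_pi(alt_count: int, n_samples: int) -> float:
    """Unbiased nucleotide diversity at one biallelic site."""
    if n_samples < 2:
        raise ValueError("need at least two sampled sequences")
    p = alt_count / n_samples
    return (n_samples / (n_samples - 1)) * 2.0 * p * (1.0 - p)


def pi_missing_as_reference(alt: int, ref: int, missing: int) -> float:
    """Diversity when accessions without a call are assumed to carry the reference allele."""
    return site_pi(alt, alt + ref + missing)


def pi_called_only(alt: int, ref: int) -> float:
    """Diversity estimated only from accessions with a confident call."""
    return site_pi(alt, alt + ref)


# Toy site (hypothetical counts): 10 alternate, 70 reference, 20 missing calls.
as_reference = pi_missing_as_reference(alt=10, ref=70, missing=20)
called_only = pi_called_only(alt=10, ref=70)
print(as_reference < called_only)  # the reference assumption gives the lower estimate
```

When the alternate allele is rare, padding the sample with assumed reference alleles lowers its estimated frequency and therefore the diversity score, which is the undercounting described above.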

We hope that ViVa will advance understanding of genotype‐phenotype relationships by allowing all researchers access to large resequencing datasets. In the future, we intend to expand ViVa beyond the plant genetics workhorse, A. thaliana, to more agriculturally relevant species with existing resequencing projects, such as rice (Wang et al., 2018) and soybean (Zhou et al., 2015). Indeed, the ViVa framework is readily adaptable to any source of targeted resequencing data. If François Jacob's metaphor holds true, and evolution is indeed a tinkerer and not an engineer (Jacob, 1977), it is only by examining the largest possible number of nature's solutions that we may eventually decipher the principles constraining innovations in form and function.

5. SUPPLEMENTAL ANALYSES

5.1. Additional natural variation in the Aux/IAA genes

For simplicity we have included only the alignment of the class A Aux/IAAs in the main manuscript. We include here the complete alignment of the family (Figure 14).

5.2. Natural variation in the TIR1/AFB genes

Auxin acts by binding to receptors (Auxin‐signaling F‐Boxes, or AFBs) that in turn target co‐repressors (Aux/IAAs) for degradation. The six auxin receptor genes in the model plant A. thaliana, TIR1 and AFB1‐5, evolved through gene duplication and diversification early in the history of vascular plants (Parry et al., 2009). The rate of co‐repressor degradation is determined by the identity of both the receptor and co‐repressor (Havens et al., 2012), and this rate sets the pace of lateral root development (Guseman et al., 2015).

All members of this family have been shown to bind auxin and Aux/IAA proteins. However, AFB1 has a drastically reduced ability to assemble into an SCF complex, due to the substitution E8K in its F‐box domain, preventing it from inducing degradation of Aux/IAAs (Yu et al., 2015). This lack of SCF formation may allow for the high and ubiquitous AFB1 accumulation observed in Arabidopsis tissues (Parry et al., 2009). Higher order receptor mutants that include afb1 alleles suggest that AFB1 has a moderate positive effect on auxin signaling (Dharmasiri et al., 2005). Additionally, AFB4 and AFB5 have been shown to preferentially and functionally bind the synthetic auxin picloram, while other family members preferentially bind indole‐3‐acetic acid (Prigge et al., 2016). Interestingly, the strength and rate with which TIR1/AFBs are able to bind and mark Aux/IAAs for degradation are variable (Calderón Villalobos et al., 2012; Havens et al., 2012). AFB2 induces the degradation of certain Aux/IAA proteins at a faster rate than TIR1, suggesting some functional specificity has arisen since the initial duplication between the TIR1/AFB1 and AFB2/AFB3 clades.

Examining the natural sequence variation across the AFB family revealed that TIR1 and AFB1 both had very low nonsynonymous diversity (Figure 8), hinting at their likely functional importance and raising questions about the still‐unresolved role of AFB1 in auxin signaling. AFB3 and AFB4 had higher nonsynonymous diversity, while their sister genes, AFB2 and AFB5, were more conserved. This matches our current understanding of AFB3 as playing a minor role in the auxin signaling pathway (Dharmasiri et al., 2005). Two frameshift variants and one stop‐gained (nonsense) variant were observed in AFB4, suggesting that AFB4 may be undergoing pseudogenization, especially when paired with its low expression levels (Prigge et al., 2016). AFB4 and AFB5 have an N‐terminal extension prior to their F‐box domains. This extension had very high nonsynonymous diversity (Figure 9), suggesting that it does not play an important functional role in these proteins.

Although most known functional regions are highly conserved in AFB1, there are some nonsynonymous polymorphisms in the oligomerization domain that are only present in single accessions (F125E in Can‐0 and I163N in Pu2‐23, as shown in Figure 5). Mutations in this domain of TIR1 frequently have a semidominant effect on root phenotypes (Dezfulian et al., 2016; Wright et al., 2017). Characterization of these alleles and accessions may help determine the role of AFB1 in this pathway.

5.3. Natural variation in the TPL/TPR genes

The auxin signaling pathway utilizes the TOPLESS (TPL) and TOPLESS‐related (TPR) family of Gro/TLE/TUP1 type co‐repressor proteins to maintain auxin responsive genes in a transcriptionally repressed state in the absence of auxin (Szemenyei et al., 2008). In A. thaliana, the five‐member TPL/TPR family includes TPL and TPR1‐4. The resulting proteins are composed of three structural domains: an N‐terminal TPL domain and two WD‐40 domains (Long, Ohno, Smith, & Meyerowitz, 2006). TPL/TPR proteins are recruited to the AUX/IAA proteins through interaction with the conserved ethylene‐responsive element binding factor‐associated amphiphilic repression (EAR) domain (Szemenyei et al., 2008). Canonical EAR domains have the amino acid sequence LxLxL, as found in most AUX/IAAs (Overvoorde et al., 2005). TPL/TPR co‐repressors bind EAR domains via their C‐terminal to LisH (CTLH) domains found near their N‐termini (Long et al., 2006). Recent structural analyses of the TPL N‐terminal domain have highlighted the precise interaction interface between TPL and AUX/IAA EAR domains, as well as the TPL‐TPL dimerization and tetramerization motifs (Ke et al., 2015; Martin‐Arevalillo et al., 2017). The residues required for higher order multimers of TPL tetramers have also been identified (Ma et al., 2017). Additional interactions with transcriptional regulation and chromatin‐modifying machinery are likely mediated by two tandem beta propeller domains of TPL/TPRs.
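A canonical EAR motif of the form LxLxL can be located with a simple pattern search. The sketch below is illustrative only: the function name and the toy peptide are hypothetical, and the pattern treats x as any single uppercase letter, which covers the standard one‐letter amino acid codes.

```python
import re
from typing import List, Tuple

# A lookahead is used so that overlapping matches (e.g., LxLxLxL) are all reported.
EAR_PATTERN = re.compile(r"(?=(L[A-Z]L[A-Z]L))")


def find_ear_motifs(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for each LxLxL occurrence."""
    return [(m.start() + 1, m.group(1)) for m in EAR_PATTERN.finditer(protein.upper())]


# Toy peptide (hypothetical, not a real Aux/IAA sequence).
print(find_ear_motifs("MAALELRLGLPG"))
```

Positions reported by such a search can then be compared with the low‐diversity regions seen in the alignment views.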

The TOPLESS co‐repressor family generally exhibits a high level of conservation at the amino acid sequence level across resequenced A. thaliana accessions, with all π N/π S values below 1 (Figure 10). The closely related TPL and TPR1 have the highest π N/π S values, suggesting that these two related genes tolerate a higher degree of sequence and potentially functional diversity compared to TPR2/3/4. The N‐terminal TPL domain of the TPL/TPR family is particularly conserved. All nonsynonymous polymorphisms observed in this region are either in the coils between helices or are highly conservative mutations within helices (e.g., valine to isoleucine), which would be predicted to have little effect on folding and function.

The high degree of conservation in the entire N‐terminal domain underscores its importance in TPL/TPR function (Figure 11). For example, the initial tpl‐1 mutation (N176H) in the ninth helix is a dominant gain‐of‐function allele (Long et al., 2006), which is capable of binding wild‐type TPL protein and inducing protein aggregation (Ma et al., 2017). It is therefore understandable that this helix had very low diversity as nonsynonymous variants in this domain could act in a dominant negative fashion.

5.4. Natural variation in the ARF genes

Auxin response is mediated by the auxin responsive transcription factors (ARFs). There are 23 ARFs in A. thaliana that are divided into three phylogenetic classes. Class A ARFs (ARF5, ARF6, ARF7, ARF8, and ARF19) activate transcription. These ARFs have a glutamine‐rich region in the middle of the protein that may mediate activation (Guilfoyle & Hagen, 2007). It has recently been shown that the middle region of ARF5 interacts with the SWI/SNF chromatin remodeling ATPases BRAMA and SPLAYED, possibly to reduce nucleosome occupancy and allow for the recruitment of transcription machinery (Wu et al., 2015). Additionally, ARF7 interacts with Mediator subunits, directly tethering transcriptional activation machinery to its binding sites in the chromosome (Ito et al., 2016). Class B and C ARFs are historically categorized as repressor ARFs, although the mechanism through which they confer repression has not been identified. Their middle regions tend to be proline‐ and serine‐rich.

Canonical ARFs are composed of three major domains. Recent crystallization of these domains has informed structure‐function analysis of the ARFs (Boer et al., 2014; Korasick et al., 2014; Nanao et al., 2014). These domains are conserved throughout land plants (Mutte et al., 2018). ARFs share an N‐terminal B3 DNA binding domain. Flanking this DNA‐binding domain is a dimerization domain, which folds up into a single “taco‐shaped” domain to allow for dimerization between ARFs. There is an auxiliary domain that immediately follows and interacts with the dimerization domain. The middle region is the most variable between ARFs, as mentioned above, but is characterized by repetitive units of glutamine (class A), serine, or proline residues (classes B and C). The C‐terminal domain of canonical ARFs is a PB1 protein‐protein interaction domain mediating interactions among ARFs, between ARFs and other transcription factors, and between ARFs and the Aux/IAA repressors. This interaction domain was recently characterized as a Phox and Bem1 (PB1) domain, which is composed of a positive and negative face with conserved basic and acidic residues, respectively (Korasick et al., 2014; Nanao et al., 2014). The dipolar nature of the PB1 domain may mediate multimerization by the pairwise interaction of these faces on different proteins, as the ARF7 PB1 domain was crystallized as a multimer (Korasick et al., 2014). However, it is unclear whether ARF multimerization occurs or plays a significant role in vivo. Interfering with ARF dimerization in either the DNA‐binding proximal dimerization domain or the PB1 domain decreases the ability of class A ARFs to activate transcription in a heterologous yeast system (Pierre‐Jerome, Moss, Lanctot, Hageman, & Nemhauser, 2016).

While domain architecture is broadly conserved among the ARFs, there are exceptional cases. Three ARFs (ARF3, ARF13, and ARF17) do not contain a PB1 domain at all, and several more have lost the conserved acidic or basic residues in the PB1 domain, suggesting they may be reduced to a single interaction domain. Several ARFs additionally have an expanded conserved region of unknown function within the DNA‐binding domain. The majority of domain variation among ARFs occurs in the large B‐class subfamily. The liverwort Marchantia polymorpha has a single representative ARF of each class (Flores‐Sandoval, Magnus Eklund, & Bowman, 2015). The expansion of these classes in flowering plants is the result of both whole genome and tandem duplication events (Remington et al., 2004). The growth of the ARF family may have allowed for the expansion of the quantity and complexity of loci regulated by the ARFs and subsequent expansion in their regulation of developmental processes (Mutte et al., 2018).

Class A ARFs are the most well‐studied ARF subfamily; the five family members all act as transcriptional activators and have well‐characterized, distinct developmental targets. Overall, the diversity of class A ARFs was low, especially compared to the class B and C ARFs (Figure 12), suggesting that class A ARFs are central to auxin signal transduction and plant development. Analysis of class A ARF nonsynonymous diversity suggests that the majority of these ARFs are highly functionally conserved, with π N/π S values much lower than 1; the exception is ARF19, with a π N/π S value of 1.8. Comparing diversity within sister pairs, there is a trade‐off similar to that seen in most IAA sister pairs, with one sister being highly conserved and the other more divergent. ARF19 and ARF8 are the more divergent class A ARFs, with π N/π S values at least three times those of their sisters, ARF7 and ARF6, respectively. This may suggest that ARF6 and ARF7 serve more essential purposes in plant development.

Many of the class B ARFs have very high π N/π S ratios relative to the other ARFs. ARF23 has a truncated DNA‐binding domain and had a high π N/π S value of 4.1. ARF13 has many high‐diversity nonsense variants and lacks a C‐terminal PB1 domain. This high level of diversity, together with the prevalence of high‐frequency nonsense variants and the frequent loss of critical domains, may suggest that several genes in this class are undergoing pseudogenization.

There are also a few highly conserved class B ARFs. The high conservation of ARF1 and ARF2 is expected as they play critical, redundant roles in senescence and abscission (Ellis et al., 2005). Little is known about ARF9, however, and its low nonsynonymous diversity may be worthy of investigation.

Class C ARFs show low nucleotide diversity scores, with all π N/π S values substantially lower than 1. ARF16 was the most conserved, whereas the other clade members (ARF10, ARF17) had scores at least four times higher. Structurally, all three members of Class C ARFs contain a canonical B3 DNA‐binding domain, but only ARF10 and ARF16 contain a PB1 domain. The DNA‐binding domains exhibited overall low diversity. Of the PB1 domain‐containing class C ARFs, ARF16's PB1 domain exhibited several sporadically distributed missense variants, in contrast to the conserved PB1 domain of ARF10. This conservation in the PB1 domain of ARF10 and the DBD of ARF16 may suggest subfunctionalization in this class.

AUTHOR CONTRIBUTIONS

RCW and JLN designed the research; MOH and RCW developed the software; MOH, BLM, ARL, HPG, AL, RR, HK, ACL, MLZ, and RCW tested the software, performed the research, and analyzed the data; MOH, BLM, ARL, HPG, AL, RR, HK, ACL, MLZ, JLN, and RCW interpreted the data and wrote the paper.

CONFLICT OF INTEREST

The authors declare no conflict of interest associated with the work described in this manuscript.


ACKNOWLEDGMENTS

The authors would like to thank Oghenemega Okolo for assistance testing the ViVa software, and Song Li and Bo Zhang for helpful comments on the manuscript. This work was supported by the National Institutes of Health (R01‐GM107084), the National Science Foundation (IOS‐1546873), and the Howard Hughes Medical Institute. R.C.W. received fellowship support from the National Science Foundation (DBI‐1402222). B.L.M. and H.K. received support from the M.J. Murdock Charitable Trust. A.R.L. is a Simons Foundation Fellow of the Life Sciences Research Foundation. A.L. was supported by an NSF Graduate Research Fellowship (DGE‐1256082).

Hamm MO, Moss BL, Leydon AR, et al. Accelerating structure‐function mapping using the ViVa webtool to mine natural variation. Plant Direct. 2019;3:1–20. 10.1002/pld3.147

This manuscript was previously deposited as a BioRxiv preprint at https://doi.org/10.1101/488395

REFERENCES

  1. Alexandre, C. M., Urton, J. R., Jean‐Baptiste, K., Huddleston, J., Dorrity, M. W., Cuperus, J. T., … Queitsch, C. (2018). Complex relationships between chromatin accessibility, sequence divergence, and gene expression in Arabidopsis thaliana. Molecular Biology and Evolution, 35(4), 837–854. https://doi.org/10.1093/molbev/msx326
  2. Allaire, J. J., Ushey, K., & Tang, Y. (2018). Reticulate: Interface to ‘Python’. Retrieved from https://CRAN.R-project.org/package=reticulate
  3. Allaire, J. J., Xie, Y., McPherson, J., Luraschi, J., Ushey, K., Atkins, A., … Iannone, R. (2018). Rmarkdown: Dynamic documents for R. Retrieved from https://CRAN.R-project.org/package=rmarkdown
  4. Aphalo, P. J. (2018a). Gginnards: Explore the innards of ‘Ggplot2’ objects. Retrieved from https://CRAN.R-project.org/package=gginnards
  5. Aphalo, P. J. (2018b). Ggpmisc: Miscellaneous extensions to ‘Ggplot2’. Retrieved from https://CRAN.R-project.org/package=ggpmisc
  6. Arnold, J. B. (2018). Ggthemes: Extra themes, scales and geoms for ‘Ggplot2’. Retrieved from https://CRAN.R-project.org/package=ggthemes
  7. Atwell, S., Huang, Y. S., Vilhjálmsson, B. J., Willems, G., Horton, M., Li, Y., … Nordborg, M. (2010). Genome‐wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature, 465(7298), 627–631. https://doi.org/10.1038/nature08800
  8. Bache, S. M., & Wickham, H. (2014). Magrittr: A forward‐pipe operator for R. Retrieved from https://CRAN.R-project.org/package=magrittr
  9. Boer, D. R., Freire‐Rios, A., van den Berg, W. A., Saaki, T., Manfield, I. W., Kepinski, S., … Coll, M. (2014). Structural basis for DNA binding specificity by the auxin‐dependent ARF transcription factors. Cell, 156(3), 577–589. https://doi.org/10.1016/j.cell.2013.12.027
  10. Borevitz, J. O., Hazen, S. P., Michael, T. P., Morris, G. P., Baxter, I. R., Hu, T. T., … Ecker, J. R. (2007). Genome‐wide patterns of single‐feature polymorphism in Arabidopsis thaliana. Proceedings of the National Academy of Sciences of the United States of America, 104(29), 12057–12062. https://doi.org/10.1073/pnas.0705323104
  11. Calderón Villalobos, L. I., Lee, S., De Oliveira, C., Ivetac, A., Armitage, L., Sheard, L. B., … Estelle, M. (2012). A combinatorial TIR1/AFB‐Aux/IAA co‐receptor system for differential sensing of auxin. Nature Chemical Biology, 8(5), 477–485. https://doi.org/10.1038/nchembio.926
  12. Chen, K., Wang, Y., Zhang, R., Zhang, H., & Gao, C. (2019). CRISPR/Cas genome editing and precision plant breeding in agriculture. Annual Review of Plant Biology, 70(1), 667–697. https://doi.org/10.1146/annurev-arplant-050718-100049
  13. Cheng, C.‐Y., Krishnakumar, V., Chan, A. P., Thibaud‐Nissen, F., Schobel, S., & Town, C. D. (2017). Araport11: A complete reannotation of the Arabidopsis thaliana reference genome. The Plant Journal, 89(4), 789–804. https://doi.org/10.1111/tpj.13415
  14. Clark, R. M., Schweikert, G., Toomajian, C., Ossowski, S., Zeller, G., Shinn, P., … Weigel, D. (2007). Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science, 317(5836), 338–342. https://doi.org/10.1126/science.1138632
  15. Delker, C., Pöschl, Y., Raschke, A., Ullrich, K., Ettingshausen, S., Hauptmann, V., … Quint, M. (2010). Natural variation of transcriptional auxin response networks in Arabidopsis thaliana. The Plant Cell, 22(7), 2184–2200. https://doi.org/10.1105/tpc.110.073957
  16. Dezfulian, M. H., Jalili, E., Roberto, D. K., Moss, B. L., Khoo, K., Nemhauser, J. L., & Crosby, W. L. (2016). Oligomerization of SCF TIR1 is essential for Aux/IAA degradation and auxin signaling in Arabidopsis. PLoS Genetics, 12(9), e1006301. https://doi.org/10.1371/journal.pgen.1006301
  17. Dharmasiri, N., Dharmasiri, S., Weijers, D., Lechner, E., Yamada, M., Hobbie, L., … Estelle, M. (2005). Plant development is regulated by a family of auxin receptor F box proteins. Developmental Cell, 9(1), 109–119. https://doi.org/10.1016/j.devcel.2005.05.014
  18. Ellis, C. M., Nagpal, P., Young, J. C., Hagen, G., Guilfoyle, T. J., & Reed, J. W. (2005). AUXIN RESPONSE FACTOR1 and AUXIN RESPONSE FACTOR2 regulate senescence and floral organ abscission in Arabidopsis thaliana. Development, 132(20), 4563–4574. https://doi.org/10.1242/dev.02012
  19. Firnberg, E., & Ostermeier, M. (2013). The genetic code constrains yet facilitates Darwinian evolution. Nucleic Acids Research, 41(15), 7420–7428. https://doi.org/10.1093/nar/gkt536
  20. Flores‐Sandoval, E., Magnus Eklund, D., & Bowman, J. L. (2015). A simple auxin transcriptional response system regulates multiple morphogenetic processes in the liverwort Marchantia polymorpha. PLoS Genetics, 11(5), e1005207. https://doi.org/10.1371/journal.pgen.1005207
  21. Garnier, S. (2018a). Viridis: Default color maps from ‘Matplotlib’. Retrieved from https://CRAN.R-project.org/package=viridis
  22. Garnier, S. (2018b). ViridisLite: Default color maps from ‘Matplotlib’ (lite version). Retrieved from https://CRAN.R-project.org/package=viridisLite
  23. Gasperini, M., Starita, L., & Shendure, J. (2016). The power of multiplexed functional analysis of genetic variants. Nature Protocols, 11(10), 1782–1787. https://doi.org/10.1038/nprot.2016.135
  24. Gray, W. M., Kepinski, S., Rouse, D., Leyser, O., & Estelle, M. (2001). Auxin regulates SCF(TIR1)‐dependent degradation of AUX/IAA proteins. Nature, 414(6861), 271–276. https://doi.org/10.1038/35104500
  25. Guilfoyle, T. J., & Hagen, G. (2007). Auxin response factors. Current Opinion in Plant Biology, 10(5), 453–460. https://doi.org/10.1016/j.pbi.2007.08.014
  26. Guilfoyle, T. J., & Hagen, G. (2012). Getting a grasp on domain III/IV responsible for auxin response factor‐IAA protein interactions. Plant Science, 190, 82–88. https://doi.org/10.1016/j.plantsci.2012.04.003
  27. Guo, Y.‐L. (2013). Gene family evolution in green plants with emphasis on the origination and evolution of Arabidopsis thaliana genes. The Plant Journal, 73(6), 941–951. https://doi.org/10.1111/tpj.12089
  28. Guseman, J. M., Hellmuth, A., Lanctot, A., Feldman, T. P., Moss, B. L., Klavins, E., … Nemhauser, J. L. (2015). Auxin‐induced degradation dynamics set the pace for lateral root development. Development, 142(5), 905–909. https://doi.org/10.1242/dev.117234
  29. Hamm, M. O., & Wright, R. C. (2018). R1001genomes: Access and analyze the 1001 Genomes Arabidopsis resequencing dataset.
  30. Havens, K. A., Guseman, J. M., Jang, S. S., Pierre‐Jerome, E., Bolten, N., Klavins, E., & Nemhauser, J. L. (2012). A synthetic approach reveals extensive tunability of auxin signaling. Plant Physiology, 160(1), 135–142. https://doi.org/10.1104/pp.112.202184
  31. Heibl, C. (2014). Ips: Interfaces to phylogenetic software in R. Retrieved from https://CRAN.R-project.org/package=ips
  32. Henry, L., & Wickham, H. (2018). Purrr: Functional programming tools. Retrieved from https://CRAN.R-project.org/package=purrr
  33. Hughes, A. L. (1999). Adaptive evolution of genes and genomes. Oxford: Oxford University Press.
  34. Hughes, A. L., Green, J. A., Garbayo, J. M., & Roberts, R. M. (2000). Adaptive diversification within a large family of recently duplicated, placentally expressed genes. Proceedings of the National Academy of Sciences of the United States of America, 97(7), 3319–3323. https://doi.org/10.1073/pnas.97.7.3319
  35. Ihaka, R., Murrell, P., Hornik, K., Fisher, J. C., Stauffer, R., Wilke, C. O., … Zeileis, A. (2019). Colorspace: A toolbox for manipulating and assessing colors and palettes. Retrieved from https://CRAN.R-project.org/package=colorspace
  36. Ito, J., Fukaki, H., Onoda, M., Li, L., Li, C., Tasaka, M., & Furutani, M. (2016). Auxin‐dependent compositional change in Mediator in ARF7‐ and ARF19‐mediated transcription. Proceedings of the National Academy of Sciences of the United States of America, 113, 6562–6567. https://doi.org/10.1073/pnas.1600739113
  37. Jacob, F. (1977). Evolution and tinkering. Science, 196(4295), 1161–1166. https://doi.org/10.1126/science.860134
  38. Joly‐Lopez, Z., Flowers, J. M., & Purugganan, M. D. (2016). Developing maps of fitness consequences for plant genomes. Current Opinion in Plant Biology, 30, 101–107. https://doi.org/10.1016/j.pbi.2016.02.008
  39. Joshi, H. J., Christiansen, K. M., Fitz, J., Cao, J., Lipzen, A., Martin, J., … Heazlewood, J. L. (2012). 1001 Proteomes: A functional proteomics portal for the analysis of Arabidopsis thaliana accessions. Bioinformatics, 28(10), 1303–1306. https://doi.org/10.1093/bioinformatics/bts133
  40. Ke, J., Ma, H., Gu, X., Thelen, A., Brunzelle, J. S., Li, J., … Melcher, K. (2015). Structural basis for recognition of diverse transcriptional repressors by the TOPLESS family of corepressors. Science Advances, 1(6), e1500107. https://doi.org/10.1126/sciadv.1500107
  41. Kliebenstein, D. J. (2008). A role for gene duplication and natural variation of gene expression in the evolution of metabolism. PLoS ONE, 3(3), e1838 10.1371/journal.pone.0001838 [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Korasick, D. A. , Westfall, C. S. , Lee, S. G. , Nanao, M. H. , Dumas, R. , Hagen, G. , … Strader, L. C. (2014). Molecular basis for AUXIN RESPONSE FACTOR protein interaction and the control of auxin response repression. Proceedings of the National Academy of Sciences of the United States of America, 111(14), 5427–5432. 10.1073/pnas.1400074111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Kück, P. , Meusemann, K. , Dambach, J. , Thormann, B. , von Reumont, B. M. , Wägele, J. W. , & Misof, B. (2010). Parametric and non‐parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees. Frontiers in Zoology, 7, 10 10.1186/1742-9994-7-10 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Lang, D. T. , & CRAN Team (2018). XML: Tools for parsing and generating XML within R and S‐Plus. Retrieved from https://CRAN.R-project.org/package=XML [Google Scholar]
  45. Long, J. A. , Ohno, C. , Smith, Z. R. , & Meyerowitz, E. M. (2006). TOPLESS regulates apical embryonic fate in Arabidopsis. Science, 312(5779), 1520–1523. 10.1126/science.1123841 [DOI] [PubMed] [Google Scholar]
  46. Long, Q. , Rabanal, F. A. , Meng, D. , Huber, C. D. , Farlow, A. , Platzer, A. , … Nordborg, M. (2013). Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden. Nature Genetics, 45(8), 884–890. 10.1038/ng.2678 [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Ma, H. , Duan, J. , Ke, J. , He, Y. , Gu, X. , Xu, T. H. , … Melcher, K. (2017). A D53 repression motif induces oligomerization of TOPLESS corepressors and promotes assembly of a corepressor‐nucleosome complex. Science Advances, 3(6), e1601217 10.1126/sciadv.1601217 [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Martin‐Arevalillo, R. , Nanao, M. H. , Larrieu, A. , Vinos‐Poyo, T. , Mast, D. , Galvan‐Ampudia, C. , … Parcy, F. (2017). Structure of the Arabidopsis TOPLESS corepressor provides insight into the evolution of transcriptional repression. Proceedings of the National Academy of Sciences of the United States of America, 114, 8107–8112. 10.1073/pnas.1703054114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Matreyek, K. A. , Stephany, J. J. , & Fowler, D. M. (2017). A platform for functional assessment of large variant libraries in mammalian cells. Nucleic Acids Research, 45(11), e102 10.1093/nar/gkx183 [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Melamed, D. , Young, D. L. , Miller, C. R. , & Fields, S. (2015). Combining natural sequence variation with high throughput mutational data to reveal protein interaction sites. PLoS Genetics, 11(2), e1004918 10.1371/journal.pgen.1004918 [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Michael, T. P. , Jupe, F. , Bemm, S. , Motley, S. T. , Sandoval, J. P. , Lanz, C. , … Ecker, J. R. (2018). High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nature Communications, 9(1), 541 10.1038/s41467-018-03016-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Moore, R. C. , & Purugganan, M. D. (2005). The evolutionary dynamics of plant duplicate genes. Current Opinion in Plant Biology, 8(2), 122–128. 10.1016/j.pbi.2004.12.001 [DOI] [PubMed] [Google Scholar]
  53. Müller, K. (2018). Bindrcpp: An ‘Rcpp’ interface to active bindings. Retrieved from https://CRAN.R-project.org/package=bindrcpp [Google Scholar]
  54. Müller, K. , & Wickham, H. (2019). Tibble: Simple data frames. Retrieved from https://CRAN.R-project.org/package=tibble [Google Scholar]
  55. Müller, K. , Wickham, H. , James, D. A. , & Falcon, S. (2018). RSQLite: ‘SQLite’ interface for R. Retrieved from https://CRAN.R-project.org/package=RSQLite [Google Scholar]
  56. Mutte, S. K. , Kato, H. , Rothfels, C. , Melkonian, M. , Wong, G. K.‐S. , & Weijers, D. (2018). Origin and evolution of the nuclear auxin response system. eLife, 7, e33399 10.7554/elife.33399 [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Nanao, M. H. , Vinos‐Poyo, T. , Brunoud, G. , Thévenon, E. , Mazzoleni, M. , Mast, D. , … Dumas, R. (2014). Structural basis for oligomerization of auxin transcriptional regulators. Nature Communications, 5, 3617 10.1038/ncomms4617 [DOI] [PubMed] [Google Scholar]
  58. Nei, M. , & Li, W. H. (1979). Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences of the United States of America, 76(10), 5269–5273. 10.1073/pnas.76.10.5269 [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Nelson, C. W. , Moncla, L. H. , & Hughes, A. L. (2015). SNPGenie: Estimating evolutionary parameters to detect natural selection using pooled next‐generation sequencing data. Bioinformatics, 31(22), 3709–3711. 10.1093/bioinformatics/btv449 [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Neuwirth, E. (2014). RColorBrewer: ColorBrewer palettes. Retrieved from https://CRAN.R-project.org/package=RColorBrewer [Google Scholar]
  61. Nieduszynski, C. A. , & Liti, G. (2011). From sequence to function: Insights from natural variation in budding yeasts. Biochimica et Biophysica Acta, 1810(10), 959–966. 10.1016/j.bbagen.2011.02.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  62. Nordborg, M. , Hu, T. T. , Ishino, Y. , Jhaveri, J. , Toomajian, C. , Zheng, H. , … Bergelson, J. (2005). The pattern of polymorphism in Arabidopsis thaliana. PLoS Biology, 3(7), e196 10.1371/journal.pbio.0030196 [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Overvoorde, P. J. , Okushima, Y. , Alonso, J. M. , Chan, A. , Chang, C. , Ecker, J. R. , … Theologis, A. (2005). Functional genomic analysis of the AUXIN/INDOLE‐3‐ACETIC ACID gene family members in Arabidopsis thaliana. The Plant Cell, 17(12), 3282–3300. 10.1105/tpc.105.036723 [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Pagès, H. , & Aboyoun, P. (2018). XVector: Representation and manipulation of external sequences. Retrieved from 10.18129/B9.bioc.XVector [DOI] [Google Scholar]
  65. Pagès, H. , Aboyoun, P. , Gentleman, R. , & DebRoy, S. (2019). Biostrings: Efficient manipulation of biological strings. Retrieved from 10.18129/B9.bioc.Biostrings [DOI] [Google Scholar]
  66. Pagès, H. , Aboyoun, P. , & Lawrence, M. (2018). IRanges: Infrastructure for manipulating intervals on sequences. Retrieved from 10.18129/B9.bioc.IRanges [DOI] [Google Scholar]
  67. Pagès, H. , Lawrence, M. , & Aboyoun, P. (2018). S4Vectors: S4 implementation of vector‐like and list‐like objects. Retrieved from 10.18129/B9.bioc.S4Vectors [DOI] [Google Scholar]
  68. Paradis, E. , Blomberg, S. , Bolker, B. , Brown, J. , Claude, J. , Cuong, H. S. , … de Vienne, D. (2018). Ape: Analyses of phylogenetics and evolution. Retrieved from https://CRAN.R-project.org/package=ape [Google Scholar]
  69. Park, B. , Rutter, M. T. , Fenster, C. B. , Symonds, V. V. , Ungerer, M. C. , & Townsend, J. P. (2017). Distributions of mutational effects and the estimation of directional selection in divergent lineages of Arabidopsis thaliana. Genetics, 206(4), 2105–2117. 10.1534/genetics.116.199190 [DOI] [PMC free article] [PubMed] [Google Scholar]
  70. Parry, G. , Calderon‐Villalobos, L. I. , Prigge, M. , Peret, B. , Dharmasiri, S. , Itoh, H. , … Estelle, M. (2009). Complex regulation of the TIR1/AFB family of auxin receptors. Proceedings of the National Academy of Sciences of the United States of America, 106(52), 22540–22545. 10.1073/pnas.0911967106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Pierre‐Jerome, E. , Moss, B. L. , Lanctot, A. , Hageman, A. , & Nemhauser, J. L. (2016). Functional analysis of molecular interactions in synthetic auxin response circuits. Proceedings of the National Academy of Sciences of the United States of America, 113(40), 11354–11359. 10.1073/pnas.1604379113 [DOI] [PMC free article] [PubMed] [Google Scholar]
  72. Płuciennik, A. , Stolarczyk, M. , Bzówka, M. , Raczyńska, A. , Magdziarz, T. , & Góra, A. (2018). BALCONY: An R package for MSA and functional compartments of protein variability analysis. BMC Bioinformatics, 19(1), 300 10.1186/s12859-018-2294-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Press, M. O. , Carlson, K. D. , & Queitsch, C. (2014). The overdue promise of short tandem repeat variation for heritability. Trends in Genetics: TIG, 30(11), 504–512. 10.1016/j.tig.2014.07.008 [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Prigge, M. J. , Greenham, K. , Zhang, Y. , Santner, A. , Castillejo, C. , Mutka, A. M. , … Estelle, M. (2016). The Arabidopsis auxin receptor F‐box proteins AFB4 and AFB5 are required for response to the synthetic auxin picloram. G3: Genes, Genomes, Genetics, 6(5), 1383–1390. 10.1534/g3.115.025585 [DOI] [PMC free article] [PubMed] [Google Scholar]
  75. R Core Team (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; https://www.R-project.org/. [Google Scholar]
  76. Rademacher, E. H. , Lokerse, A. S. , Schlereth, A. , Llavata‐Peris, C. I. , Bayer, M. , Kientz, M. , … Weijers, D. (2012). Different auxin response machineries control distinct cell fates in the early plant embryo. Developmental Cell, 22(1), 211–222. 10.1016/j.devcel.2011.10.026 [DOI] [PubMed] [Google Scholar]
  77. Remington, D. L. , Vision, T. J. , Guilfoyle, T. J. , & Reed, J. W. (2004). Contrasting modes of diversification in the Aux/IAA and ARF gene families. Plant Physiology, 135(3), 1738–1752. 10.1104/pp.104.039669 [DOI] [PMC free article] [PubMed] [Google Scholar]
  78. Ronquist, F. , & Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19(12), 1572–1574. 10.1093/bioinformatics/btg180 [DOI] [PubMed] [Google Scholar]
  79. Starita, L. M. , Ahituv, N. , Dunham, M. J. , Kitzman, J. O. , Roth, F. P. , Seelig, G. , … Fowler, D. M. (2017). Variant interpretation: Functional assays to the rescue. The American Journal of Human Genetics, 101(3), 315–325. 10.1016/j.ajhg.2017.07.014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  80. Szemenyei, H. , Hannon, M. , & Long, J. A. (2008). TOPLESS mediates auxin‐dependent transcriptional repression during Arabidopsis embryogenesis. Science, 319(5868), 1384–1386. 10.1126/science.1151461 [DOI] [PubMed] [Google Scholar]
  81. Tan, X. , Calderon‐Villalobos, L. I. , Sharon, M. , Zheng, C. , Robinson, C. V. , Estelle, M. , & Zheng, N. (2007). Mechanism of auxin perception by the TIR1 ubiquitin ligase. Nature, 446(7136), 640–645. 10.1038/nature05731 [DOI] [PubMed] [Google Scholar]
  82. Team, The Bioconductor Dev . (2018). BiocGenerics: S4 generic functions for Bioconductor. Retrieved from 10.18129/B9.bioc.BiocGenerics [DOI] [Google Scholar]
  83. The Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408(6814), 796–815. 10.1038/35048692 [DOI] [PubMed] [Google Scholar]
  84. Tiwari, S. B. , Hagen, G. , & Guilfoyle, T. (2003). The roles of auxin response factor domains in auxin‐responsive transcription. The Plant Cell, 15(2), 533–543. 10.1105/tpc.008417 [DOI] [PMC free article] [PubMed] [Google Scholar]
  85. Tiwari, S. B. , Hagen, G. , & Guilfoyle, T. J. (2004). Aux/IAA proteins contain a potent transcriptional repression domain. The Plant Cell, 16(2), 533–543. 10.1105/tpc.017384 [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Ulmasov, T. , Murfett, J. , Hagen, G. , & Guilfoyle, T. J. (1997). Aux/IAA proteins repress expression of reporter genes containing natural and highly active synthetic auxin response elements. The Plant Cell Online, 9(11), 1963–1971. 10.1105/tpc.9.11.1963 [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Waese, J. , Fan, J. , Pasha, A. , Yu, H. , Fucile, G. , Shi, R. , … Provart, N. J. (2017). ePlant: Visualizing and exploring multiple levels of data for hypothesis generation in plant biology. The Plant Cell, 29(8), 1806–1821. 10.1105/tpc.17.00073 [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Wagih, O. (2017). Ggseqlogo: A ‘Ggplot2’ extension for drawing publication‐ready sequence logos. Retrieved from https://CRAN.R-project.org/package=ggseqlogo [Google Scholar]
  89. Wang, W. , Mauleon, R. , Hu, Z. , Chebotarov, D. , Tai, S. , Wu, Z. , … Mansueto, L. (2018). Genomic variation in 3,010 diverse accessions of Asian cultivated rice. Nature, 557(7703), 43 10.1038/s41586-018-0063-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  90. Weigel, D. , & Mott, R. (2009). The 1001 genomes project for Arabidopsis thaliana. Genome Biology, 10(5), 107 10.1186/gb-2009-10-5-107 [DOI] [PMC free article] [PubMed] [Google Scholar]
  91. Whitehead, T. A. , Chevalier, A. , Song, Y. , Dreyfus, C. , Fleishman, S. J. , De Mattos, C. , … Baker, D. (2012). Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing. Nature Biotechnology, 30(6), 543–548. 10.1038/nbt.2214 [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Wickham, H. (2016). Plyr: Tools for splitting, applying and combining data. Retrieved from https://CRAN.R-project.org/package=plyr [Google Scholar]
  93. Wickham, H. (2017a). Reshape2: Flexibly reshape data: A reboot of the reshape package. Retrieved from https://CRAN.R-project.org/package=reshape2 [Google Scholar]
  94. Wickham, H. (2017b). Tidyverse: Easily install and load the ‘tidyverse’. Retrieved from https://CRAN.R-project.org/package=tidyverse [Google Scholar]
  95. Wickham, H. (2018a). Forcats: Tools for working with categorical variables (factors). Retrieved from https://CRAN.R-project.org/package=forcats [Google Scholar]
  96. Wickham, H. (2018b). Scales: Scale functions for visualization. Retrieved from https://CRAN.R-project.org/package=scales [Google Scholar]
  97. Wickham, H. (2018c). Stringr: Simple, consistent wrappers for common string operations. Retrieved from https://CRAN.R-project.org/package=stringr [Google Scholar]
  98. Wickham, H. , Chang, W. , Henry, L. , Pedersen, T. L. , Takahashi, K. , Wilke, C. , & Woo, K. (2018). Ggplot2: Create elegant data visualisations using the grammar of graphics. Retrieved from https://CRAN.R-project.org/package=ggplot2 [Google Scholar]
  99. Wickham, H. , François, R. , Henry, L. , & Müller, K. (2018). Dplyr: A grammar of data manipulation. Retrieved from https://CRAN.R-project.org/package=dplyr [Google Scholar]
  100. Wickham, H. , & Henry, L. (2018). Tidyr: Easily tidy data with ‘spread()’ and ‘gather()’ functions. Retrieved from https://CRAN.R-project.org/package=tidyr [Google Scholar]
  101. Wickham, H. , Hester, J. , & Francois, R. (2018). Readr: Read rectangular text data. Retrieved from https://CRAN.R-project.org/package=readr [Google Scholar]
  102. Wilkins, A. , Erdin, S. , Lua, R. , & Lichtarge, O. (2012). Evolutionary trace for prediction and redesign of protein functional sites. Methods in Molecular Biology (Clifton, N.J.), 819, 29–42. 10.1007/978-1-61779-465-0_3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  103. Winkler, M. , Niemeyer, M. , Hellmuth, A. , Janitza, P. , Christ, G. , Samodelov, S. L. , … Calderón Villalobos, L. I. A. (2017). Variation in auxin sensing guides AUX/IAA transcriptional repressor ubiquitylation and destruction. Nature Communications, 8, 15706 10.1038/ncomms15706 [DOI] [PMC free article] [PubMed] [Google Scholar]
  104. Wright, E. S. (2015). DECIPHER: Harnessing local sequence context to improve protein multiple sequence alignment. BMC Bioinformatics, 16, 322 10.1186/s12859-015-0749-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  105. Wright, E. (2019). DECIPHER: Tools for curating, analyzing, and manipulating biological sequences. Retrieved from 10.18129/B9.bioc.DECIPHER [DOI] [Google Scholar]
  106. Wright, R. C. , Zahler, M. L. , Gerben, S. R. , & Nemhauser, J. L. (2017). Insights into the evolution and function of auxin signaling F‐Box proteins in Arabidopsis thaliana through synthetic analysis of natural variants. Genetics, 207(2), 583–591. 10.1534/genetics.117.300092 [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Wu, M.‐F. , Yamaguchi, N. , Xiao, J. , Bargmann, B. , Estelle, M. , Sang, Y. , & Wagner, D. (2015). Auxin‐regulated chromatin switch directs acquisition of flower primordium founder fate. eLife, 4, e09269 10.7554/elife.09269 [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Xie, Y. (2018a). Bookdown: Authoring books and technical documents with R markdown. Retrieved from https://CRAN.R-project.org/package=bookdown [Google Scholar]
  109. Xie, Y. (2018b). Knitr: A general‐purpose package for dynamic report generation in R. Retrieved from https://CRAN.R-project.org/package=knitr [Google Scholar]
  110. Xie, Y. , Cheng, J. , & Tan, X. (2018). DT: A wrapper of the Javascript library ‘Datatables’. Retrieved from https://CRAN.R-project.org/package=DT [Google Scholar]
  111. Yu, G. (2018). Treeio: Base classes and functions for phylogenetic tree input and output. Retrieved from https://guangchuangyu.github.io/software/treeio [Google Scholar]
  112. Yu, G. , & Lam, T. T.‐Y. (2019). Ggtree: An R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Retrieved from https://guangchuangyu.github.io/software/ggtree [Google Scholar]
  113. Yu, H. , Zhang, Y. , Moss, B. L. , Bargmann, B. O. , Wang, R. , Prigge, M. , … Estelle, M. (2015). Untethering the TIR1 auxin receptor from the SCF complex increases its stability and inhibits auxin response. Nature Plants, 1(3), 14030 10.1038/nplants.2014.30 [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Zenser, N. , Ellsmore, A. , Leasure, C. , & Callis, J. (2001). Auxin modulates the degradation rate of Aux/IAA proteins. Proceedings of the National Academy of Sciences of the United States of America, 98(20), 11795–11800. 10.1073/pnas.211312798 [DOI] [PMC free article] [PubMed] [Google Scholar]
  115. Zhou, Z. , Jiang, Y. , Wang, Z. , Gou, Z. , Lyu, J. , Li, W. , … Tian, Z. (2015). Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in soybean. Nature Biotechnology, 33(4), 408–414. 10.1038/nbt.3096 [DOI] [PubMed] [Google Scholar]

Articles from Plant Direct are provided here courtesy of Wiley

RESOURCES