Table 1.
Component | Explanation | Figure 8 usage | Figure 1 usage |
---|---|---|---|
Data | Data to visualize, containing variables and values | A gene expression table | A GRanges object (core data structure in Bioconductor) |
Geom | A geometric object draws the data as a graphical primitive. Types of primitives include points, lines, polygons or text. Some statistical or composite primitives, such as histogram, boxplot and point range, are considered to be geoms | Points with color indicating significance of expression (red = significant, black = not) | Alignments (new), Chevron (new) |
Stat | A statistical transformation transforms, filters and/or summarizes a variable prior to plotting. For example, binning and counting is necessary to make a histogram. The default would be an identity transformation, which does not change the data. In ggplot2, an appropriate default transformation is chosen according to the geom, for example, the bin transform for the histogram geom. Thus, the user rarely needs to explicitly specify one | Identity (computation of the M and A values is done outside of the grammar) | Steppings (new) |
Scales | A scale maps the variables (for example, expression, treatment, gene id) from data space to aesthetics (for example, position, color, area). Scales also control associated guides like axes and legends. Included in scales are numerical transformations such as log or square root of variables, so that an axis can be drawn on a log scale, for example. The default is a linear scale | A, the log geometric average, mapped to the x axis, and M, the log ratio, mapped to the y axis | Genomic position mapped to position along x axis, and levels mapped to y axis |
Coord | A coordinate system controls how two position scales work together. The default is the Cartesian coordinate system, but others such as a polar coordinate system could be chosen | Cartesian | Cartesian |
Facet | A faceting specification is used to produce small multiples [42] for subsets of the data. In other graphical systems it is known as latticing [43], trellising [44] or even conditioning | None | None |
Layout (new) | A layout is a new grammatical component for controlling how multiple plots are arranged in a figure. It was motivated by the need to display multiple genomic annotation data sets simultaneously, and also supports genomic overviews | Single | Linear |
Components of the basic and extended grammars of graphics, and how they are used in Figures 8 and 1. Figure 9 illustrates how the grammar has been extended for biological data. Entries marked with 'new' are those developed as part of this work; the rest are inherited from ggplot2.
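
To make the Figure 8 column concrete, the following is a minimal sketch of how its components could be written down as a plain specification. It is not the ggplot2 or ggbio API. The gene names, expression values, the `ma_values` and `build_points` helpers, the `spec` dictionary and the choice of base-2 logarithms are all assumptions made for illustration. As in the table, the M and A values are computed outside the grammar, so the stat is the identity.

```python
import math

# Hypothetical gene expression table: (gene id, control, treated, significant?)
expression = [
    ("geneA", 8.0, 32.0, True),
    ("geneB", 16.0, 16.0, False),
    ("geneC", 64.0, 16.0, True),
]


def ma_values(control, treated):
    """Return (A, M) for one gene, computed outside the grammar.

    A is the log of the geometric average and M is the log ratio.
    Base-2 logarithms are an assumption of this sketch.
    """
    a = 0.5 * (math.log2(treated) + math.log2(control))
    m = math.log2(treated) - math.log2(control)
    return a, m


def build_points(rows):
    """Apply the scales: A to x, M to y, significance to color."""
    points = []
    for gene, control, treated, significant in rows:
        a, m = ma_values(control, treated)
        points.append({
            "gene": gene,
            "x": a,
            "y": m,
            "color": "red" if significant else "black",
        })
    return points


# One entry per grammar component, mirroring the Figure 8 column of the table.
spec = {
    "data": expression,
    "geom": "point",
    "stat": "identity",  # M and A are precomputed, so nothing is transformed
    "scales": {"x": "A", "y": "M", "color": "significance"},
    "coord": "cartesian",
    "facet": None,
    "layout": "single",
}

points = build_points(spec["data"])
for p in points:
    print(p["gene"], p["x"], p["y"], p["color"])
```

Keeping the specification separate from the computed points mirrors the separation the grammar draws between what is plotted (data, stat) and how it is drawn (geom, scales, coord).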