2.5 Noteworthy ggplot details

Several details characterize commands in ggplot2 (Wickham et al., 2020) and distinguish them from other commands in the tidyverse (Wickham, 2019c):

  • ggplot() requires some data (in the right format) and maps variables (i.e., columns of data) to aesthetic properties (e.g., positions on the x- and y-axis, color, or shape) of geometric objects (called geoms). It typically assumes that the to-be-plotted data is a table (data frame or tibble) in long format and contains categorical independent variables as factors.

  • The argument names data = and mapping = can be omitted, but an aesthetic mapping aes(<MAPPING>) for at least one geom is needed.

  • Different geoms can be stacked as multiple layers of a plot, but their order matters: Later layers are printed on top of earlier ones.

  • When multiple geoms use the same aesthetic mappings, their common aes(<MAPPING>) can be moved into the initial ggplot() call (i.e., after <DATA>), from which all geoms inherit it.

  • In multi-line ggplot() calls, a sequence of commands is combined by the symbol +, rather than %>%, which is the forward pipe operator provided by the magrittr package (Bache & Wickham, 2014). The + symbol is placed at the end of every non-final line, rather than at the beginning of the next line, because R would otherwise treat the command as complete at the end of the line. (See Section 3.3 for more information about the forward pipe operator.)
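The points above can be illustrated with a minimal sketch. The data frame df and its variables (group, time, score) are hypothetical and made up for this example only. The two plots differ solely in the order of their geoms:

```r
library(ggplot2)

# Hypothetical example data in long format: one row per observation,
# with the categorical variable `group` stored as a factor.
df <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 4)),
  time  = rep(1:4, times = 3),
  score = c(2, 4, 5, 7,  1, 3, 6, 8,  3, 3, 4, 6)
)

# The argument names data = and mapping = are omitted. The common aes()
# is placed in ggplot(), so both geoms inherit it.
# Every non-final line ends with +.
p1 <- ggplot(df, aes(x = time, y = score, color = group)) +
  geom_line() +
  geom_point(size = 3)   # later layer: points are drawn on top of lines

# Same geoms in reverse order: lines are now drawn on top of points.
p2 <- ggplot(df, aes(x = time, y = score, color = group)) +
  geom_point(size = 3) +
  geom_line()

p1
```

Printing p1 and p2 side by side shows the effect of layer order where lines and points overlap.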

Both the content and the visual appearance of plots are highly customizable (e.g., by supplying aesthetic arguments, combining multiple geoms in one plot, using facets, adding labels and legends, and applying pre-defined themes). Tuning plots can be a lot of fun (see Section 4.2.11 for some examples), but always keep in mind your current goal and the plot’s intended audience.
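As a rough sketch of such customizations, the following example combines two geoms, facets, labels, and a pre-defined theme. It uses the mpg data that is included in ggplot2; the specific choice of variables, labels, and theme is an arbitrary assumption made for illustration:

```r
library(ggplot2)

# mpg ships with ggplot2; the variable choices below are illustrative only.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  # First geom: color is mapped to a variable, alpha is set to a constant
  geom_point(aes(color = class), alpha = 0.6) +
  # Second geom: a linear trend line per facet, without confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "black") +
  # Facets: one panel per type of drive train
  facet_wrap(~ drv) +
  # Labels for title, axes, and the color legend
  labs(title = "Fuel economy by engine size",
       x = "Engine displacement (in liters)",
       y = "Highway miles per gallon",
       color = "Type of car") +
  # A pre-defined theme
  theme_bw()

p
```

Each component is added as a separate line, so individual customizations can be removed or exchanged without affecting the rest of the plot.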