2.5 Noteworthy ggplot details

The complexity and power of ggplot2 (Wickham, Chang, et al., 2024) can be confusing and overwhelming at first. But as the underlying “grammar of graphics” is highly systematic, it is helpful to identify its main concepts and some elements of its syntax (i.e., what goes where). Here are some noteworthy details that help in composing visualizations with the ggplot() function:

  • Basics: Creating a plot with ggplot() requires some <DATA>, some geometric object (geom), and some mapping of variables to the elements of a particular type of plot:

    • The <DATA> used is typically a rectangular table (data frame or tibble) in long format, with categorical variables represented as factors.

    • An aesthetic mapping assigns variables (i.e., columns of <DATA>) to the dimensions of a plot (e.g., its \(x\)- or \(y\)-coordinates) or to the visual features of geometric objects (so-called geoms).

    • While the argument names data = and mapping = can be omitted, an aesthetic mapping aes(<MAPPING>) for at least one geom is required.

  • Geoms: The type of plot is determined by selecting geometric objects (geoms):

    • Different geoms require different aesthetic mappings.

    • Multiple geoms can be stacked as the layers of a plot, but their order matters: Later layers are drawn on top of earlier ones.

    • In multi-line ggplot() calls, a sequence of functions is combined by the symbol +, rather than by %>%, which is the forward pipe operator provided by the magrittr package (Bache & Wickham, 2022). The + symbol is placed at the end of every non-final line, rather than at the beginning of the next line, as R would otherwise treat the line as a complete expression and not continue the plot on the next line. (See Section 3.3 for information about the forward pipe operator.)

  • Aesthetics can be assigned on multiple levels and as arguments of different functions:

    • When multiple geoms share aesthetic mappings, their common aes(<MAPPING>) appears in the initial line of the ggplot() function (i.e., after <DATA>). This first line contains global aesthetic mappings (applicable to all geoms), whereas individual geoms contain local aesthetic mappings (applicable only to that particular geom).

    • Aesthetics (like colors, sizes, shapes, etc.) can be assigned both within an aes() function and outside of aes(), as arguments to any geom_...(). The difference is that assignments within aes() are mappings (i.e., they map variables to aesthetics), whereas assignments outside of aes() do not involve variables (i.e., they set aesthetics to specific values).
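The points above can be illustrated by a minimal sketch. It assumes that the mpg data (which ships with ggplot2) stands in for <DATA>. The object names p_basic, p_layers, and p_set, and the specific colors and sizes, are only chosen for illustration:

```r
library(ggplot2)

# Assumption: the mpg data set included in ggplot2 serves as <DATA>.

# Basics: data, an aesthetic mapping, and one geom
# (the argument names data = and mapping = are omitted):
p_basic <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Layers and levels of aesthetics:
# - The global mapping (x, y) in the first line applies to both geoms.
# - The local mapping aes(color = class) applies only to the points.
# - size, alpha, and color = "black" are set outside of aes() (fixed values).
# - Each non-final line ends with the + symbol.
p_layers <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class), size = 2, alpha = 0.6) +  # earlier layer
  geom_smooth(method = "lm", formula = y ~ x,
              se = FALSE, color = "black")                 # later layer, drawn on top

# Setting rather than mapping: all points share one color, no legend appears.
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(color = "steelblue")

p_layers  # printing a plot object draws the plot
```

Swapping the order of geom_point() and geom_smooth() in p_layers would draw the points on top of the line, which shows why the order of layers matters.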

Both the content and the visual appearance of plots are highly customizable (e.g., by supplying aesthetic arguments, combining multiple geoms in one plot, using facets, adding labels and legends, and applying pre-defined themes). Tuning plots can be fun (see Section 4.2.11 for some examples), but we should always keep in mind our plot’s communicative goal and its intended audience.
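As a rough sketch of such customizations, the following code combines an aesthetic mapping with facets, labels, and a pre-defined theme. It again assumes the mpg data from ggplot2. The object name p_custom, the label texts, and the choice of theme_minimal() are merely illustrative:

```r
library(ggplot2)

# Assumption: the mpg data set included in ggplot2 serves as <DATA>.
p_custom <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point(alpha = 0.7) +           # semi-transparent points (a fixed setting)
  facet_wrap(~ year) +                # one panel per model year
  labs(title = "Mileage by engine displacement",
       x = "Engine displacement (liters)",
       y = "Highway mileage (miles per gallon)",
       color = "Drive train") +       # axis labels and legend title
  theme_minimal()                     # a pre-defined theme

p_custom  # printing the plot object draws the plot
```

Whether such additions actually improve a plot depends on what it is supposed to show and to whom.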

References

Bache, S. M., & Wickham, H. (2022). magrittr: A forward-pipe operator for R. Retrieved from https://magrittr.tidyverse.org
Wickham, H., Chang, W., Henry, L., Pedersen, T. L., Takahashi, K., Wilke, C., … van den Brand, T. (2024). ggplot2: Create elegant data visualisations using the grammar of graphics. Retrieved from https://ggplot2.tidyverse.org