10.3 Basic Aesthetics

In ggplot2, the aes() function specifies the desired aesthetics and is used inside other graphing functions. Nest aes() within ggplot() to apply the aesthetics globally (to all of the graph’s elements). Nest aes() within a specific geom function (geom_point(), geom_line(), geom_errorbar(), etc.) to apply them to that graph element only; more on this later.

10.3.1 The aes vignette

Execute ??aes to view the options for graphing aesthetics and select the “Aesthetic specifications” vignette.

This help page will go over how to alter the appearance of:

  • Color and fill of various objects

  • Line size/type

  • Point shape, color, and fill

  • Text size, font, font face, justification
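If searching with ??aes returns many results, the vignette can also be opened by name. The sketch below assumes ggplot2 is installed and that the vignette is registered under the name "ggplot2-specs", which is the case in the ggplot2 releases we are aware of:

```r
library(ggplot2)

# List every vignette that ships with ggplot2
available <- vignette(package = "ggplot2")
available$results[, c("Item", "Title")]

# Open the "Aesthetic specifications" vignette directly
# (assumes the vignette name "ggplot2-specs")
vignette("ggplot2-specs", package = "ggplot2")
```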

There is an abundance of resources to help you graph using the ggplot2 package. This guide will go over a select few that are frequently used.

10.3.2 Color

Let’s change the graph so that each cut is differentiated by color:

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut, color = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, ymax = m + se)) +
  geom_line()

10.3.3 Coloring Structural Elements

There is a more specific method of changing the color aesthetics of the graph. Rather than changing the entire color scheme of the graph (points, error bars, connecting lines), you can elect to change one part. This allows for some customization.

Make sure the sem() function is defined in your environment before following along with the examples in this section!

sem <- function(x, na.rm = FALSE) {
  # Count only non-missing values when na.rm = TRUE so n matches sd()
  n <- if (na.rm) sum(!is.na(x)) else length(x)
  sd(x, na.rm = na.rm) / sqrt(n)
}

Global Color Changes

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, 
             y = m, 
             group = cut,  
             color = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, 
                    ymax = m + se)) +
  geom_line()

Changing the Color of Data Points Only

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, 
             y = m, 
             group = cut)) +
  geom_point(aes(color = cut)) +
  geom_errorbar(aes(ymin = m - se, 
                    ymax = m + se)) +
  geom_line()

Changing the Color of Error Bars Only

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, 
             y = m, 
             group = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, 
                    ymax = m + se,        
                    color = cut)) +
  geom_line()

Changing the Color of Connecting Lines Only

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, 
             y = m, 
             group = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, 
                    ymax = m + se)) +
  geom_line(aes(color = cut))
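One way to check which layers actually received a color mapping is to inspect the computed aesthetics with layer_data(). The following is a minimal sketch rather than part of the original walkthrough: the object names (avg_price, p, point_colors, line_colors) are illustrative, and the error bars are left out to keep it short.

```r
library(dplyr)
library(ggplot2)

# Mean price for each clarity/cut combination
avg_price <- diamonds %>%
  group_by(clarity, cut) %>%
  summarize(m = mean(price), .groups = "drop")

# Color is mapped to cut for the points only
p <- ggplot(avg_price, aes(x = clarity, y = m, group = cut)) +
  geom_point(aes(color = cut)) +
  geom_line()

# layer_data() returns the aesthetics computed for a single layer
point_colors <- unique(layer_data(p, 1)$colour)  # layer 1: geom_point()
line_colors  <- unique(layer_data(p, 2)$colour)  # layer 2: geom_line()

length(point_colors)  # one color per cut category
length(line_colors)   # a single default color shared by every line
```

Moving aes(color = cut) from geom_point() into ggplot() should make both layers report one color per cut category.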

10.3.4 Shape

We could also opt to change the shape of each cut category for further distinction. Notice how each cut category is now represented by a different symbol:

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut, color = cut, shape = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, ymax = m + se)) +
  geom_line()
## Warning: Using shapes for an ordinal variable is not advised

R will display a warning message stating that using shapes for an ordinal variable is not advised. R occasionally gives feedback on your code. In some cases, it issues an error message, which stops your code from running. In other cases (such as this one), it issues a warning message, which does not stop your code from running but does offer a suggestion.
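The difference between a warning and an error can be seen outside of ggplot2 as well. The base R sketch below is only an illustration; the variable names converted and failed are made up for this example.

```r
# A warning: R finishes the call and still returns a result (NA here)
converted <- as.numeric("ten")

# An error: R stops the call; try() captures the failure so that
# the rest of the script can keep running
failed <- try(log("ten"), silent = TRUE)

is.na(converted)               # TRUE
inherits(failed, "try-error")  # TRUE
```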

10.3.5 Line Type

You can differentiate the cut categories by different line types, though this may not be very useful in this particular situation:

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price),
            se = sem(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut, color = cut, linetype = cut)) +
  geom_point() +
  geom_errorbar(aes(ymin = m - se, ymax = m + se)) +
  geom_line()

Remember that in order to use the sem() function in the example above, it must be defined in the environment using:

sem <- function(x, na.rm = FALSE) {
  # Count only non-missing values when na.rm = TRUE so n matches sd()
  n <- if (na.rm) sum(!is.na(x)) else length(x)
  sd(x, na.rm = na.rm) / sqrt(n)
}

10.3.6 Inside vs. Outside aes()

Some aesthetics need to be nested within aes() and some do not. How do we know when to place them outside versus inside?

  • Aesthetics for a specific variable of your data go inside aes().

  • An aesthetic that will remain a constant value – irrespective of the values of your data – should be placed outside of aes() and within the geom element.

As an example, let’s focus on the color argument in ggplot().

First, place the color argument inside of aes():

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut, color = cut)) +
  geom_point() 

Now place the color argument outside of aes():

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut), color = cut) +
  geom_point()

In this example, setting the color argument to a variable, cut, requires that the color argument be nested inside aes(). This is because color is being mapped to the values of a variable in the data, not set to one fixed value for all of the data points (geom_point()).

In order to specify that we want all data points displayed in a single color, we could execute:

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut)) +
  geom_point(color = "purple")

Notice that in order to specify a single color for all points, you must place the color argument outside of aes() under the specific geom element. Here, the color of the data points does not consider each cut category – the color is the same across all cuts (Fair, Good, Very Good, etc.).
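Mapped and constant aesthetics can be combined in the same graph. The sketch below is an illustration under our own assumptions (the names avg_price and p are made up, and size = 3 is an arbitrary choice): the lines are colored by cut inside aes(), while every point is set to purple outside aes().

```r
library(dplyr)
library(ggplot2)

# Mean price for each clarity/cut combination
avg_price <- diamonds %>%
  group_by(clarity, cut) %>%
  summarize(m = mean(price), .groups = "drop")

p <- ggplot(avg_price, aes(x = clarity, y = m, group = cut)) +
  geom_line(aes(color = cut)) +            # inside aes(): varies with cut
  geom_point(color = "purple", size = 3)   # outside aes(): constant

# The line layer has one color per cut; the point layer has only "purple"
unique(layer_data(p, 1)$colour)
unique(layer_data(p, 2)$colour)
```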

When you place a color argument inside aes(), R treats the argument’s value as data to be mapped to a color, not as the color itself:

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut)) +
  geom_point(aes(color = "purple")) 

OR

diamonds %>% 
  group_by(clarity, cut) %>% 
  summarize(m = mean(price)) %>% 
  ggplot(aes(x = clarity, y = m, group = cut, color = "purple")) +
  geom_point() 

In the above examples, R treats “purple” as a value in the data rather than as the name of a color (notice the legend). We know that the user is attempting to color the data points purple, but R does not evaluate the code that way. R reads code according to fixed rules and will not “fill in the blanks” by guessing the user’s intended actions.

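Though “purple” is not an existing variable name in the diamonds dataset, R still recognizes that the code is attempting to map a color. As a result, every point falls into a single category labeled “purple”, and that category is drawn in the first color of the default palette, a salmon red.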
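To see what happened to the string, you can compare the colors computed for each version of the graph. This is a minimal sketch; the names avg_price, p_mapped, and p_set are illustrative and do not appear elsewhere in this guide.

```r
library(dplyr)
library(ggplot2)

# Mean price for each clarity/cut combination
avg_price <- diamonds %>%
  group_by(clarity, cut) %>%
  summarize(m = mean(price), .groups = "drop")

# "purple" inside aes(): treated as a one-level category
p_mapped <- ggplot(avg_price, aes(x = clarity, y = m)) +
  geom_point(aes(color = "purple"))

# "purple" outside aes(): treated as a literal color
p_set <- ggplot(avg_price, aes(x = clarity, y = m)) +
  geom_point(color = "purple")

mapped_color <- unique(layer_data(p_mapped)$colour)  # a palette color, not "purple"
set_color    <- unique(layer_data(p_set)$colour)     # "purple"
```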