Chapter 6 Grammar for visualization

Let’s start with some practical plotting

6.1 Excercise 1 Fill in the blank spaces

First read in data and then fill in the blank spaces.

library(tidyverse)

kunder <- read_csv("path-to-data")

ggplot(kunder, aes(x = column-for-price-plan)) +
    geom_bar() +
    coord_flip() +
    labs(title = "Title of the plot",
                        x = "Title for X", "Title for Y")

Like this:

library(tidyverse)

kunder <- read_csv("data/tele2-kunder.csv")

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   source_date = col_datetime(format = ""),
##   ar_key = col_character(),
##   cust_id = col_character(),
##   pc_l3_pd_spec_nm = col_character(),
##   cpe_type = col_character(),
##   cpe_net_type_cmpt = col_character(),
##   pc_priceplan_nm = col_character(),
##   sc_l5_sales_cnl = col_character(),
##   rt_fst_cstatus_act_dt = col_datetime(format = ""),
##   rrpu_amt_used = col_character(),
##   rcm1pu_amt_used = col_character()
## )

## See spec(...) for full column specifications.

ggplot(kunder, aes(x = pc_priceplan_nm)) +
    geom_bar() +
    coord_flip() +
    labs(title = "Number of customers per priceplan",
                        x = "Priceplan", y = "Number of customers")

6.1.1 Excercise 2

Map one of the tr_tot_data... columns to x

ggplot(kunder, aes(x = any-of-the-data-vol-months)) +
    geom_density()

ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_7)) +
    geom_density()

## Warning: Removed 4 rows containing non-finite values (stat_density).

6.2 Let’s look at this visualization, what’s in it?

library(gapminder)
library(ggplot2)
library(gganimate)

p <- gapminder %>%
  filter(year == 1972) %>% 
        ggplot(aes(gdpPercap, lifeExp, size = pop, colour = continent)) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(2, 12)) +
  scale_x_log10() +
  labs(title = 'Year: 1972', x = 'GDP per capita', y = 'life expectancy')

p

Data
Aestetics
Geometrical objects
Scales
Statistics
Facets

##Data

Firstly, we have data

library(gapminder)
library(ggplot2)
library(gganimate)

head(gapminder)

## # A tibble: 6 x 6
##   country     continent  year lifeExp      pop gdpPercap
##   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.

Note that we got tidy data, each observations is a row and every column is a variable.

Data is the first argument in ggplot()

ggplot(data = data)

When we filter data

data <- gapminder %>%
  filter(year == 1972)

p <- ggplot(data = data)

6.3 aestsetics

In order for the visualization to work we need to specify the mapping in the data.
We map aestetics such as x and y or color and size

6.3.1 Excercise 3

map x to gdpPercap and y to lifeExp

ggplot(data = data, mapping = aes(x = ..., y = ...))

ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp))

6.4 Geometrical objects

We need a geometrical object to represent the mapping and the data such as lines, bars or points.

ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point() #<<

We can manipulate the objects by adding more aesthetics

6.4.1 Excercise 4

Map size to pop

p <- ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp,
                                  size = ...)) + #<<
  geom_point()

p <- ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp,
                                  size = pop)) + #<<
  geom_point()

6.4.2 Excercise 5

Map color to continent

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                       size = pop,
                                       color = ...)) + #<<
  geom_point()

Like this:

ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                       size = pop,
                                       color = continent)) + #<<
  geom_point()

6.5 Scale

Per default ggplot() uses the following functions scale_y_continuous() and scale_x_continuous() when x and y are numerical.

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous() + #<<
  scale_x_continuous() #<<

Scale-functions have arguments
Sometimes we want to show the y axis from 0 to max

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 80)) + #<<
  scale_x_continuous()

6.5.1 Excercise 6

Not only the axis scale can be manipulated
Change the color scale to scale_color_viridis_d()

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 80)) + 
  scale_x_continuous() +
     scale_color_...() #<<

Like this:

ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 80)) + 
  scale_x_continuous() +
     scale_color_viridis_d()

6.5.2 Scales are important

Working with log-normal data

ggplot(data, aes(x = gdpPercap)) +
  geom_density()

p <- ggplot(data, aes(x = gdpPercap)) +
  geom_density() +
  scale_x_log10() #<<

6.5.3 Excercise 7

Change the x scale to scale_x_log10()

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous() +
  scale_...

Like this:

ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous() +
     scale_x_log10()

6.6 Statistical features

Often we want to complement the plot with a statistical calculation, for instance a trend line.

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm") #<<

## Warning in qt((1 - level)/2, df): NaNs produced

We can add calculations with stat_...(), but we can specify our own models:

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm", formula = y ~ 1)

Or use non-linear trends

p <- ggplot(gapminder, mapping = aes(x = year,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  stat_smooth(method = "loess") #<<

If we want to calculate for the whole population we simply move size and color out from the main statement.

p <- ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(size = pop, color = continent)) + #<<
  scale_x_log10() +
  stat_smooth(method = "lm")

6.7 Facets

One common task is to visualize groups in sub plots but in the same graph. This is one of the most powerful features in ggplot2.
We do this with facets

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm") +
  facet_wrap(~continent) #<<

## Warning in qt((1 - level)/2, df): NaNs produced

6.7.1 Excercise 8

We can specify the scales as “free” or “fixed”.
Change scales to “free”

p <- ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm") +
  facet_wrap(~continent, scales = "fixed") #<<

ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm") +
  facet_wrap(~continent, scales = "free")

## Warning in qt((1 - level)/2, df): NaNs produced

6.8 Coordinate system

The last component in the plot is coordinate system - Usually cartesian

Most useful when we want to flip the plot

p <- ggplot(data, aes(x = continent)) +
    geom_bar() +
    coord_flip() #<<

6.9 Let’s get practical

6.9.1 Excercise 9

Visualize the relationship between the last two months of data volume.

Which are you aestetics?
Which geom do you use?
Is the scale suitable?
Can you use any statistical computation to visualize the relationship?

ggplot(data, aes(x = x, y = y)) +
  geom_... +
  stat_... +
     labs(title = "", x = "", y = "")

p <- ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_1, y = tr_tot_data_vol_all_netw_2)) +
    geom_point() +
    scale_y_log10() +
    scale_x_log10()

## Warning: Removed 4 rows containing missing values (geom_point).

6.9.2 Excercise 10

Add aestetics

Such as fill, color or size

p <- ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_1,
                                                                                                y = tr_tot_data_vol_all_netw_2,
                                                                                                color = cpe_type,
                                                                                                size = alloc_rrpu_amt)) +
    geom_point() +
    scale_y_log10() +
    scale_x_log10()

## Warning: Removed 4 rows containing missing values (geom_point).

6.10 Geoms to visualize one variable

geom_bar() and geom_col() for bar charts
geom_histogram()
geom_density() for distributions
geom_boxplot()
geom_violin() for so called violin plots

ggplot(gapminder, aes(x = lifeExp)) +
  geom_histogram(bins = 30) #<<

ggplot(gapminder, aes(x = lifeExp)) +
  geom_density() #<<

ggplot(gapminder, aes(x = lifeExp, y = lifeExp)) +
  geom_boxplot() #<<

## Warning: Continuous x aesthetic -- did you forget aes(group=...)?

ggplot(gapminder, aes(x = lifeExp, y = lifeExp)) +
  geom_violin() #<<

6.11 Groups

map groups to x to have multiple groups

p <- ggplot(gapminder, aes(x = continent, #<<
                      y = lifeExp)) +
  geom_violin()

6.12 Using `dplyr` with `ggplot2` part 1

You can pipe dplyr with ggplot with %>%

p <- gapminder %>% #<<
  filter(country %in% c("Sweden", "Norway", "Denmark", "Finland")) %>% #<<
  ggplot(aes(x = country, y = lifeExp)) +
  geom_violin()

6.12.1 Excercise 11

Vizualise calculated revenue per month (alloc_rrpu_amt)

Use %>%
Which geom best represents the data?
Is the distribution normal?
Are there any outliers? How do you filter these out?
Is the scale suitable?

p <- kunder %>% 
    filter(alloc_rrpu_amt < 1000) %>% 
ggplot(aes(x = alloc_rrpu_amt)) +
    geom_density()

6.13 Use `dplyr` with `ggplot2` part 2

We can use dplyr to calculate summary statistics and then visualize it with ggplot.

p <- gapminder %>%
  group_by(continent) %>%
  summarize(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = continent, y = mean_lifeExp)) +
  geom_col() #<<

What’s the difference between geom_bar() and geom_col?

geom_bar() How many obsevations
geom_col() Values in summary tables

ggplot(gapminder, aes(x = continent)) +
  geom_bar() #<<

6.14 Changing the order on the axis

Function reorder(variable, by = variable)

p <- gapminder %>%
  group_by(continent) %>%
  summarize(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = reorder(continent, -mean_lifeExp), #<<
             y = mean_lifeExp)) +
  geom_col()

6.15 Flipped plots

p <- gapminder %>%
  group_by(continent) %>%
  summarize(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = reorder(continent, mean_lifeExp), y = mean_lifeExp)) + #<<
  geom_col() +
  coord_flip()

6.15.1 Excercise 12

Inspect the categories of pc_priceplan_nm, are there any natural categories they could be assigned to?
Create a new column that groups priceplan into three or four categories, use case_when() and str_detect()
Create a summary table of revenue per month per priceplan category (that you previously created)
Visualize it with an adequate geom
Reorder the plot in ascending order
flip the coord if you get a problem with axis
Style your plot with a theme and labels

kunder_sample <- read_csv("data/tele2-kunder-sample.csv")

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   source_date = col_date(format = ""),
##   ar_key = col_character(),
##   cust_id = col_character(),
##   pc_l3_pd_spec_nm = col_character(),
##   cpe_type = col_character(),
##   cpe_net_type_cmpt = col_character(),
##   pc_priceplan_nm = col_character(),
##   sc_l5_sales_cnl = col_character(),
##   rt_fst_cstatus_act_dt = col_date(format = ""),
##   rrpu_amt_used = col_character(),
##   rcm1pu_amt_used = col_character()
## )

## See spec(...) for full column specifications.

kunder_sample %>% 
    group_by(pc_priceplan_nm) %>% 
    count(sort = T)

## # A tibble: 137 x 2
## # Groups:   pc_priceplan_nm [137]
##    pc_priceplan_nm                n
##    <chr>                      <int>
##  1 Fast pris                  44578
##  2 Fast pris +EU 8 GB - 2018  20011
##  3 Unlimited - 2018           15887
##  4 Fast pris +EU 3 GB - 2018  14906
##  5 Fast pris +EU 15 GB - 2018 13686
##  6 Bredband 4G                13218
##  7 Fast pris +EU 1 GB         13017
##  8 Fast pris +EU 15 GB        12634
##  9 Fast pris +EU 5 GB         12610
## 10 Rörligt pris               12273
## # … with 127 more rows

kunder_sample <- kunder_sample %>% 
    mutate(priceplan_cat = case_when(
        str_detect(tolower(pc_priceplan_nm), "bredband") ~ "Bredband",
        str_detect(tolower(pc_priceplan_nm), "fast") ~ "Fast pris",
        TRUE ~ "Rörligt/övrigt")
        )

p <- kunder_sample %>% 
    group_by(priceplan_cat) %>% 
    summarise(median_rev = median(alloc_rrpu_amt, na.rm = T)) %>% 
    ggplot(aes(x = reorder(priceplan_cat, median_rev),  y = median_rev)) +
    geom_col() +
    coord_flip() +
    theme_minimal() +
    labs(title = "Revenue per priceplan",
                        x = "Priceplan", y = "Median revenue")

6.16 Annotations and styling

It is fairly easy to create color palettes and custome themes
Example: Coop

library(crmutils)
p <- kunder %>% 
    group_by(cpe_type) %>% 
    summarise(log_mean_rev = log(mean(alloc_rrpu_amt, na.rm = T))) %>% 
    ggplot(aes(x = reorder(cpe_type, log_mean_rev),  y = log_mean_rev, fill = cpe_type)) +
    geom_col() +
    scale_fill_coop() + #<<
    coord_flip() +
    theme_minimal() +
    labs(title = "Revenue per priceplan",
                        x = "CPE Type", y = "Log mean revenue") +
    geom_label(aes(label = round(log_mean_rev, 1)), fill = "white") + #<<
    theme(legend.position = "none") #<<

6.17 Interactivity

Multiple libraries for interactive, web-based, plots
plotly, highcharter, r2d3 and more

Plotly:

library(plotly)
ggplotly(p)

## Warning in geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_GeomLabel() has yet to be implemented in plotly.
##   If you'd like to see this geom implemented,
##   Please open an issue with your example code at
##   https://github.com/ropensci/plotly/issues

Plotly also have an own syntax for custom graphs - highly recommended for interactive applications in Shiny.

6.17.1 Excericse 13

Make some of your plots interactive with ggplotly()

6.18 Time over?

Test as many geoms as you can
Try out plotlys own syntax for creating graphs, can be found here: https://plot.ly/r/line-and-scatter/

6.19 Exporting plots

There are several ways you can export visualizations.

The easiest way is to use ggsave()

Reading the documentation on ggsave() how can you use it?

You can also export visualizations to PowerPoint if using the package officer.

You can read more about officer here: https://davidgohel.github.io/officer/articles/powerpoint.html

library(officer)

## 
## Attaching package: 'officer'

## The following object is masked from 'package:readxl':
## 
##     read_xlsx

p <- gapminder %>% 
    filter(year == 1972) %>% 
    ggplot(mapping = aes(x = gdpPercap,
                                                                                        y = lifeExp,
                                                                                        size = pop,
                                                                                        color = continent)) +
    geom_point() +
    scale_y_continuous(limits = c(0, 80)) + #<<
    scale_x_log10() +
    theme_minimal() +
    labs(title = )

p

This way you can automate powerpoint-presentations.

doc <- read_pptx()
doc <- add_slide(doc, layout = "Title and Content", master = "Office Theme") %>% 
    ph_with(value = "Let's get rich...", location = ph_location_type(type = "title")) %>% 
    ph_with(value = p, location = ph_location_type(type = "body"))z

print(doc, target = "ph_with_gg.pptx")

6.19.1 .15

Create your own powerpoint-presentation!

You may also refer to another powerpoint-presentation in read_pptx(path = "documents/report_q2.pptx") to use it as a template. Feel free to try this out on your own computer (i.e. not in the cloud environment), read more about it on the officer-website.

6.20 Answers to excercies visualization

6.21 Excercise 1.

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   source_date = col_datetime(format = ""),
##   ar_key = col_character(),
##   cust_id = col_character(),
##   pc_l3_pd_spec_nm = col_character(),
##   cpe_type = col_character(),
##   cpe_net_type_cmpt = col_character(),
##   pc_priceplan_nm = col_character(),
##   sc_l5_sales_cnl = col_character(),
##   rt_fst_cstatus_act_dt = col_datetime(format = ""),
##   rrpu_amt_used = col_character(),
##   rcm1pu_amt_used = col_character()
## )

## See spec(...) for full column specifications.

6.22 Excercise 2.

Let’s make a distribution plot

ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_7)) +
    geom_density()

## Warning: Removed 4 rows containing non-finite values (stat_density).

6.23 Excercise 3.

map x to gdpPercap and y to lifeExp

library(gapminder)
data <- gapminder %>%
  filter(year == 1972)

ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp))

6.24 Excercise 4.

Map size to pop

ggplot(data = data, mapping = aes(x = gdpPercap, y = lifeExp,
                                  size = pop)) + #<<
  geom_point()

6.25 Excercise 5.

Map color to continent

ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                       size = pop,
                                       color = continent)) + #<<
  geom_point()

6.26 Excercise 6.

Change the color scale to scale_color_viridis_d()

ggplot(data = data, mapping = aes(x = gdpPercap,
                                       y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous(limits = c(0, 80)) + 
  scale_x_continuous() +
     scale_color_viridis_d() #<<

6.27 Excercise 7.

Change the x scale to scale_x_log10()

ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_y_continuous() +
     scale_x_log10()

6.28 Excercise 8.

We can specify the scales as “free” or “fixed”.
Change scales to “free”

ggplot(data = data, mapping = aes(x = gdpPercap,
                                  y = lifeExp,
                                  size = pop,
                                  color = continent)) +
  geom_point() +
  scale_x_log10() +
  stat_smooth(method = "lm") +
  facet_wrap(~continent, scales = "free") #<<

## Warning in qt((1 - level)/2, df): NaNs produced

6.29 Excercise 9.

Visualize the relationship between the last two months of data volume.

Which are you aestetics?
Which geom do you use?
Is the scale suitable?
Can you use any statistical computation to visualize the relationship?

ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_1, y = tr_tot_data_vol_all_netw_2)) +
  geom_point() +
  stat_smooth(method = "lm") +
     labs(title = "Relationship", x = "Tot vol february", y = "Tot vol march")

6.30 Excercise 10.

Add aestetics

Such as fill, color or size

ggplot(kunder, aes(x = tr_tot_data_vol_all_netw_1,
                                                                                                y = tr_tot_data_vol_all_netw_2,
                                                                                                color = cpe_type,
                                                                                                size = alloc_rrpu_amt)) +
    geom_point() +
    scale_y_log10() +
    scale_x_log10()

## Warning: Removed 4 rows containing missing values (geom_point).

6.31 Excercise 11.

Vizualise calculated revenue per month (alloc_rrpu_amt)

Use %>%
Which geom best represents the data?
Is the distribution normal?
Are there any outliers? How do you filter these out?
Is the scale suitable?

kunder_sample <- read_csv("data/tele2-kunder-sample.csv")

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   source_date = col_date(format = ""),
##   ar_key = col_character(),
##   cust_id = col_character(),
##   pc_l3_pd_spec_nm = col_character(),
##   cpe_type = col_character(),
##   cpe_net_type_cmpt = col_character(),
##   pc_priceplan_nm = col_character(),
##   sc_l5_sales_cnl = col_character(),
##   rt_fst_cstatus_act_dt = col_date(format = ""),
##   rrpu_amt_used = col_character(),
##   rcm1pu_amt_used = col_character()
## )

## See spec(...) for full column specifications.

kunder_sample %>% 
    filter(alloc_rrpu_amt < 1000) %>% 
ggplot(aes(x = alloc_rrpu_amt)) +
    geom_histogram()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

6.32 Excercise 12.

Inspect the categories of pc_priceplan_nm, are there any natural categories they could be assigned to?

kunder_sample %>% 
    group_by(pc_priceplan_nm) %>% 
    count(sort = T)

## # A tibble: 137 x 2
## # Groups:   pc_priceplan_nm [137]
##    pc_priceplan_nm                n
##    <chr>                      <int>
##  1 Fast pris                  44578
##  2 Fast pris +EU 8 GB - 2018  20011
##  3 Unlimited - 2018           15887
##  4 Fast pris +EU 3 GB - 2018  14906
##  5 Fast pris +EU 15 GB - 2018 13686
##  6 Bredband 4G                13218
##  7 Fast pris +EU 1 GB         13017
##  8 Fast pris +EU 15 GB        12634
##  9 Fast pris +EU 5 GB         12610
## 10 Rörligt pris               12273
## # … with 127 more rows

Create a new column that groups priceplan into three or four categories

kunder_sample <- kunder_sample %>% 
    mutate(priceplan_cat = case_when(
        str_detect(tolower(pc_priceplan_nm), "bredband") ~ "Bredband",
        str_detect(tolower(pc_priceplan_nm), "fast") ~ "Fast pris",
        TRUE ~ "Rörligt/övrigt")
        )

Create a summary table of revenue per month per priceplan category (that you previously created)
Visualize it with an adequate geom
Reorder the plot in ascending order
flip the coord if you get a problem with axis
Style your plot with a theme and labels

p <- kunder_sample %>% 
    group_by(priceplan_cat) %>% 
    summarise(median_rev = median(alloc_rrpu_amt, na.rm = T)) %>% 
    ggplot(aes(x = reorder(priceplan_cat, median_rev),  y = median_rev)) +
    geom_col() +
    coord_flip() +
    theme_minimal() +
    labs(title = "Revenue per priceplan",
                        x = "Priceplan", y = "Median revenue")

p

6.33 Excercise 13.

Make some of your plots interactive with ggplotly()

library(plotly)
ggplotly(p)

6.34 Excercise 14.

Add annotations, labels and other styling to your plot

6.35 Example: Annotations and styling

It is fairly easy to create color palettes and custome themes
Example: Coop

kunder %>% 
    group_by(cpe_type) %>% 
    summarise(log_mean_rev = log(mean(alloc_rrpu_amt, na.rm = T))) %>% 
    ggplot(aes(x = reorder(cpe_type, log_mean_rev),  y = log_mean_rev, fill = cpe_type)) +
    geom_col() +
    coord_flip() +
    theme_minimal() +
    labs(title = "Revenue per priceplan",
                        x = "CPE Type", y = "Log mean revenue") +
    geom_label(aes(label = round(log_mean_rev, 1)), fill = "white") + #<<
    theme(legend.position = "none") #<<

Chapter 6 Grammar for visualization

6.1 Excercise 1 Fill in the blank spaces

6.1.1 Excercise 2

6.2 Let’s look at this visualization, what’s in it?

6.3 aestsetics

6.3.1 Excercise 3

6.4 Geometrical objects

6.4.1 Excercise 4

6.4.2 Excercise 5

6.5 Scale

6.5.1 Excercise 6

6.5.2 Scales are important

6.5.3 Excercise 7

6.6 Statistical features

6.7 Facets

6.7.1 Excercise 8

6.8 Coordinate system

6.9 Let’s get practical

6.9.1 Excercise 9

6.9.2 Excercise 10

6.10 Geoms to visualize one variable

6.11 Groups

6.12 Using dplyr with ggplot2 part 1

6.12.1 Excercise 11

6.13 Use dplyr with ggplot2 part 2

6.14 Changing the order on the axis

6.15 Flipped plots

6.15.1 Excercise 12

6.16 Annotations and styling

6.17 Interactivity

6.17.1 Excericse 13

6.18 Time over?

6.19 Exporting plots

6.19.1 .15

6.20 Answers to excercies visualization

6.21 Excercise 1.

6.22 Excercise 2.

6.23 Excercise 3.

6.24 Excercise 4.

6.25 Excercise 5.

6.26 Excercise 6.

6.27 Excercise 7.

6.28 Excercise 8.

6.29 Excercise 9.

6.30 Excercise 10.

6.31 Excercise 11.

6.32 Excercise 12.

6.33 Excercise 13.

6.34 Excercise 14.

6.35 Example: Annotations and styling

6.12 Using `dplyr` with `ggplot2` part 1

6.13 Use `dplyr` with `ggplot2` part 2