## 5.9 Time: Line charts & events

### 5.9.1 Data, packages & functions

• Line and path plots are typically used for time series data (see Appendix Line vs. path plots).
• Line plots (geom_line()) join the points from left to right. They usually have time on the x-axis and show how a single variable has changed over time.
• Path plots (geom_path()) join the points in the order in which they appear in the dataset (in other words, a line plot is a path plot of the data sorted by x value).
• Below we’ll also use gtrends() from the gtrendsR package to obtain search frequencies.
• And we’ll use pivot_wider() from the tidyr package.
• ggnewscale can be used to reset a scale if we want to generate several legends for the same aesthetic.
• And we’ll use scale modification to show proper legends in line plots.
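To make the difference between the two geoms concrete, here is a minimal sketch on a tiny made-up data frame whose rows are deliberately not sorted by x. The data and object names are illustrative assumptions, not part of the lab data.

```r
library(ggplot2)

# Toy data, deliberately NOT sorted by x (values are made up for illustration)
df <- data.frame(
  x = c(3, 1, 4, 2),
  y = c(2, 1, 3, 5)
)

# geom_line() connects the observations ordered by x (left to right)
p_line <- ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  geom_point()

# geom_path() connects the observations in the order they appear in df
p_path <- ggplot(df, aes(x = x, y = y)) +
  geom_path() +
  geom_point()

# Sorting the data by x first makes the path plot match the line plot
p_path_sorted <- ggplot(df[order(df$x), ], aes(x = x, y = y)) +
  geom_path() +
  geom_point()

p_line
p_path
```

Printing p_line and p_path side by side should show the same points connected in two different orders, while p_path_sorted should look like p_line.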

### 5.9.2 Graph

• Here we’ll reproduce Figure 5.15 (but with ggplot2) (Bauer et al. 2020)
• Questions:
• What does the graph show? What are the underlying variables (and data)?
• How many scales/mappings does it use? Could we reduce them?
• What do you like, what do you dislike about the figure? What is good, what is bad?
• What kind of information could we add to the graph (if any)?
• How would you approach a replication of the graph?

Figure 5.15: Lines (trends) and events

### 5.9.3 Lab: Data & code

• Learning objectives
• How to plot dates
• How to make line plots
• How to create manual legends for various elements
• How to visualize events

We’ll start by preparing the data.

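library(tidyverse)   # dplyr, tidyr, stringr, ggplot2
library(gtrendsR)
library(ggnewscale)

# Words to search for
search.words <- c("GDPR", "DSGVO")

# Get Google Trends data; keep the interest-over-time data frame
google.trends <- gtrends(search.words,
  gprop = "web",
  time = "2018-03-01 2018-11-16",
  geo = "DE")$interest_over_time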

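# Reshape to wide format: one column per keyword/geo combination
google.trends <- google.trends %>%
  pivot_wider(names_from = c("keyword", "geo"), values_from = "hits") %>%
  dplyr::select(-time, -gprop, -category)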

# Replace "<1" with 0 (this turns every column into character)
google.trends <- google.trends %>%
  mutate(across(everything(), ~ str_replace(.x, "<1", "0")))

# Convert date variable
google.trends$date <- as.Date(google.trends$date, "%Y-%m-%d")

# Convert the remaining factor or character columns to numeric
google.trends <- google.trends %>%
  mutate(across(where(is.factor), as.character)) %>%
  mutate(across(where(is.character), as.numeric))

head(google.trends)

| date | GDPR_DE | DSGVO_DE |
|---|---|---|
| 2018-03-01 | 2 | 4 |
| 2018-03-02 | 1 | 3 |
| 2018-03-03 | 0 | 1 |
| 2018-03-04 | 0 | 1 |
| 2018-03-05 | 1 | 4 |
| 2018-03-06 | 1 | 5 |
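Because gtrends() needs an internet connection, it can help to try the reshaping and cleaning steps offline first. The following is a minimal sketch on a small made-up data frame that mimics the long structure described above (one row per date and keyword, with low counts stored as "<1"). The object names trends_long and trends_wide and all values are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up stand-in for downloaded search data (not real Google Trends values);
# hits is character because low values are stored as "<1"
trends_long <- tibble(
  date    = rep(as.Date(c("2018-03-01", "2018-03-02")), times = 2),
  hits    = c("2", "<1", "4", "3"),
  keyword = rep(c("GDPR", "DSGVO"), each = 2),
  geo     = "DE"
)

trends_wide <- trends_long %>%
  # Replace "<1" with "0", then convert the counts to numbers
  mutate(hits = as.numeric(str_replace(hits, "<1", "0"))) %>%
  # One column per keyword/geo combination, here GDPR_DE and DSGVO_DE
  pivot_wider(names_from = c(keyword, geo), values_from = hits)

trends_wide
```

Cleaning the hits column before reshaping keeps the date column untouched, so no separate date conversion is needed in this sketch.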

We plot the data and add our own annotations in Figure 5.16. Let’s go through the code together.
ggplot(data = google.trends) +
  # Shade the three survey field periods
  geom_rect(aes(fill = "fieldperiod"),
    xmin = as.Date("2018-04-16", "%Y-%m-%d"),
    xmax = as.Date("2018-04-23", "%Y-%m-%d"),
    ymin = 0, ymax = 100, alpha = 0.2) +
  geom_rect(aes(fill = "fieldperiod"),
    xmin = as.Date("2018-07-24", "%Y-%m-%d"),
    xmax = as.Date("2018-08-02", "%Y-%m-%d"),
    ymin = 0, ymax = 100, alpha = 0.2) +
  geom_rect(aes(fill = "fieldperiod"),
    xmin = as.Date("2018-10-29", "%Y-%m-%d"),
    xmax = as.Date("2018-11-07", "%Y-%m-%d"),
    ymin = 0, ymax = 100, alpha = 0.2) +
  # Search trends, one line per keyword
  geom_line(aes(x = date, y = DSGVO_DE, color = "dsgvocolor")) +
  geom_line(aes(x = date, y = GDPR_DE, color = "gdprcolor")) +
  theme_light() +
  ylab("Searches (100 = max. interest in time period/territory)") +
  xlab("Month (2018)") +
  # Legend for the two lines; breaks fix the order so the labels match.
  # The legend title and the GDPR line colour are assumptions.
  scale_colour_manual(name = "Searches",
    values = c(gdprcolor = "blue",
      dsgvocolor = "black"),
    breaks = c("gdprcolor", "dsgvocolor"),
    labels = c("GDPR Searches",
      "DSGVO Searches")) +
  scale_fill_manual(name = "Field periods",
    values = c(fieldperiod = "gray"),
    labels = c("Wave 1, 2 and 3")) +
  # Start a second colour scale so the event gets its own legend
  new_scale_color() +
  scale_colour_manual(name = "Events",
    values = c(Policy_implementation = "red"),
    labels = c("Policy implementation (25th of May)")) +
  geom_vline(aes(xintercept = as.Date("2018-05-25"), color = "Policy_implementation")) +
  theme(
    legend.position = c(.95, .95),
    legend.justification = c("right", "top"),
    legend.box.just = "right",
    legend.margin = margin(6, 6, 6, 6),
    legend.background = element_rect(fill = alpha("white", 0.8)))

Figure 5.16: Lines (trends) and events
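As a possible alternative for the shaded field periods, the start and end dates can be stored in a small data frame and drawn with a single geom_rect() layer, which draws exactly one rectangle per period. The sketch below uses simulated search values (random numbers, not real Google Trends data) and the field period dates from the plot above. The object names trends and periods, and the use of scale_x_date() for monthly axis labels, are assumptions added for illustration.

```r
library(ggplot2)

# Simulated search interest (random numbers, not real Google Trends data)
set.seed(1)
trends <- data.frame(
  date = seq(as.Date("2018-03-01"), as.Date("2018-11-16"), by = "day")
)
trends$hits <- round(runif(nrow(trends), min = 0, max = 100))

# Field periods from the plot above, one row per survey wave
periods <- data.frame(
  wave  = c("Wave 1", "Wave 2", "Wave 3"),
  start = as.Date(c("2018-04-16", "2018-07-24", "2018-10-29")),
  end   = as.Date(c("2018-04-23", "2018-08-02", "2018-11-07"))
)

p <- ggplot(trends, aes(x = date, y = hits)) +
  # One rectangle per row of `periods`; inherit.aes = FALSE keeps this
  # layer from looking for `date` and `hits` in `periods`
  geom_rect(data = periods,
            aes(xmin = start, xmax = end, ymin = 0, ymax = 100,
                fill = "Field periods"),
            inherit.aes = FALSE, alpha = 0.2) +
  geom_line() +
  # The event is drawn as a vertical line at its date
  geom_vline(xintercept = as.Date("2018-05-25"), colour = "red") +
  scale_fill_manual(name = NULL, values = c("Field periods" = "gray")) +
  # Monthly breaks with abbreviated month names on the date axis
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  labs(x = "Month (2018)", y = "Simulated searches") +
  theme_light()

p
```

Keeping the periods in a data frame also makes it easy to add or change waves without copying a whole layer.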

### 5.9.4 Exercise

1. Use the code from above and investigate Google searches for two other topics (e.g. “COVID” and “Hydroxychloroquine”). Choose a sensible time period for your search. And choose a sensible geographic area (e.g., geo = "US").
2. Convert the data into long format etc. (following the steps above) so that you can visualize it as a line plot in ggplot.
3. Add events to your line plots (e.g., one could take one of Trump’s tweets as an event).
4. Try to visualize a legend (it’s challenging!).
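One possible starting point for steps 2 and 4 is sketched below: with the data in long format, mapping colour to the keyword column draws one line per search term and lets ggplot2 build the legend automatically. The data frame, the column names term_a and term_b, and all values are made-up placeholders, not real search data.

```r
library(ggplot2)
library(tidyr)

# Made-up wide data with one column per search term (placeholder values)
searches_wide <- data.frame(
  date   = as.Date("2020-03-01") + 0:4,
  term_a = c(10, 25, 40, 30, 20),
  term_b = c(5, 8, 12, 30, 45)
)

# Long format: one row per date and search term
searches_long <- pivot_longer(searches_wide,
                              cols = c(term_a, term_b),
                              names_to = "keyword",
                              values_to = "hits")

# Mapping colour to `keyword` draws one line per term and
# lets ggplot2 create the legend without manual scales
p_long <- ggplot(searches_long, aes(x = date, y = hits, colour = keyword)) +
  geom_line() +
  labs(x = "Date", y = "Searches", colour = "Search term") +
  theme_light()

p_long
```

Events and field periods can then be layered on top in the same way as in the lab code.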

### References

Bauer, Paul C, Frederic Gerdon, Florian Keusch, and Frauke Kreuter. 2020. “The Impact of the GDPR Policy on Data Sharing/Privacy Attitudes.” Preliminary Draft, 1–22.