第 13 章 ggplot2之几何对象

采菊东篱下,悠然见南山。

根据大家投票,觉得ggplot2是最想掌握的技能,我想这就是R语言中最有质感的部分吧。所以,这里专门拿出一节课讲ggplot2,也算是补上之前第 7 章数据可视化没讲的内容。

library(tidyverse)

13.1 一个有趣的案例

先看一组数据

df <- read_csv("./demo_data/datasaurus.csv")
df
## # A tibble: 1,846 x 3
##    dataset     x     y
##    <chr>   <dbl> <dbl>
##  1 dino     55.4  97.2
##  2 dino     51.5  96.0
##  3 dino     46.2  94.5
##  4 dino     42.8  91.4
##  5 dino     40.8  88.3
##  6 dino     38.7  84.9
##  7 dino     35.6  79.9
##  8 dino     33.1  77.6
##  9 dino     29.0  74.5
## 10 dino     26.2  71.4
## # ... with 1,836 more rows

先用dataset分组后,然后计算每组下x的均值和方差,y的均值和方差,以及x,y两者的相关系数,我们发现每组数据下它们几乎都是相等的

df %>%
  group_by(dataset) %>%
  summarise(
    across(everything(), list(mean = mean, sd = sd), .names = "{fn}_{col}")
  ) %>%
  mutate(
    across(is.numeric, round, 3)
  )
## # A tibble: 13 x 5
##    dataset    mean_x  sd_x mean_y  sd_y
##    <chr>       <dbl> <dbl>  <dbl> <dbl>
##  1 away         54.3  16.8   47.8  26.9
##  2 bullseye     54.3  16.8   47.8  26.9
##  3 circle       54.3  16.8   47.8  26.9
##  4 dino         54.3  16.8   47.8  26.9
##  5 dots         54.3  16.8   47.8  26.9
##  6 h_lines      54.3  16.8   47.8  26.9
##  7 high_lines   54.3  16.8   47.8  26.9
##  8 slant_down   54.3  16.8   47.8  26.9
##  9 slant_up     54.3  16.8   47.8  26.9
## 10 star         54.3  16.8   47.8  26.9
## 11 v_lines      54.3  16.8   47.8  26.9
## 12 wide_lines   54.3  16.8   47.8  26.9
## 13 x_shape      54.3  16.8   47.8  26.9

如果上面代码不熟悉,可以用第 6 章的代码重新表达,也是一样的

df %>%
  group_by(dataset) %>%
  summarize(
    mean_x = mean(x),
    mean_y = mean(y),
    std_dev_x = sd(x),
    std_dev_y = sd(y),
    corr_x_y = cor(x, y)
  )
## # A tibble: 13 x 6
##    dataset   mean_x mean_y std_dev_x std_dev_y corr_x_y
##    <chr>      <dbl>  <dbl>     <dbl>     <dbl>    <dbl>
##  1 away        54.3   47.8      16.8      26.9  -0.0641
##  2 bullseye    54.3   47.8      16.8      26.9  -0.0686
##  3 circle      54.3   47.8      16.8      26.9  -0.0683
##  4 dino        54.3   47.8      16.8      26.9  -0.0645
##  5 dots        54.3   47.8      16.8      26.9  -0.0603
##  6 h_lines     54.3   47.8      16.8      26.9  -0.0617
##  7 high_lin~   54.3   47.8      16.8      26.9  -0.0685
##  8 slant_do~   54.3   47.8      16.8      26.9  -0.0690
##  9 slant_up    54.3   47.8      16.8      26.9  -0.0686
## 10 star        54.3   47.8      16.8      26.9  -0.0630
## 11 v_lines     54.3   47.8      16.8      26.9  -0.0694
## 12 wide_lin~   54.3   47.8      16.8      26.9  -0.0666
## 13 x_shape     54.3   47.8      16.8      26.9  -0.0656

那么,我们是否能得出结论,每组的数据长的差不多呢?然而,我们画图发现

ggplot(df, aes(x = x, y = y, colour = dataset)) +
  geom_point() +
  # geom_smooth(method = lm) +
  theme(legend.position = "none") +
  facet_wrap(~dataset, ncol = 3)

事实上,每张图都相差很大。所以,这里想说明的是,眼见为实。换句话说,可视化是数据探索中非常重要的部分。本章的目的就是带领大家学习ggplot2基本的绘图技能。

13.2 学习目标

  1. Grammer of Graphics

  1. data: 数据框data.frame (注意,不支持向量vector和列表list类型)

  2. aes: 数据框中的数据变量映射到图形属性。什么叫图形属性?就是图中点的位置、形状,大小,颜色等眼睛能看到的东西。什么叫映射?就是一种对应关系,比如数学中的函数b = f(a)就是ab之间的一种映射关系, a的值决定或者控制了b的值,在ggplot2语法里,a就是我们输入的数据变量,b就是图形属性, 这些图形属性包括:

    • x(x轴方向的位置)
    • y(y轴方向的位置)
    • color(点或者线等元素的颜色)
    • size(点或者线等元素的大小)
    • shape(点或者线等元素的形状)
    • alpha(点或者线等元素的透明度)
  3. geoms: 几何对象,确定我们想画什么样的图,一个geom_***确定一种图形。更多几何对象推荐阅读这里

    • geom_bar()
    • geom_density()
    • geom_freqpoly()
    • geom_histogram()
    • geom_violin()
    • geom_boxplot()
    • geom_col()
    • geom_point()
    • geom_smooth()
    • geom_tile()
    • geom_density2d()
    • geom_bin2d()
    • geom_hex()
    • geom_count()
    • geom_text()
    • geom_sf()
  4. stats: 统计变换

  5. scales: 标度

  6. coord: 坐标系统

  7. facet: 分面

  8. layer: 增加图层

  9. theme: 主题风格

  10. save: 保存图片

ggplot2图层语法框架

13.3 开始

前面讲到R语言数据类型有字符串型、数值型、因子型、逻辑型、日期型等,ggplot2会将字符串型、因子型、逻辑型、日期型默认为离散变量,而数值型默认为连续变量。我们在而呈现数据的时候,可能会同时用到多种类型的数据,比如

  • 一个离散

  • 一个连续

  • 两个离散

  • 两个连续

  • 一个离散, 一个连续

  • 三个连续

13.3.1 导入数据

gapdata <- read_csv("./demo_data/gapminder.csv")
gapdata
## # A tibble: 1,704 x 6
##    country    continent  year lifeExp     pop gdpPercap
##    <chr>      <chr>     <dbl>   <dbl>   <dbl>     <dbl>
##  1 Afghanist~ Asia       1952    28.8  8.43e6      779.
##  2 Afghanist~ Asia       1957    30.3  9.24e6      821.
##  3 Afghanist~ Asia       1962    32.0  1.03e7      853.
##  4 Afghanist~ Asia       1967    34.0  1.15e7      836.
##  5 Afghanist~ Asia       1972    36.1  1.31e7      740.
##  6 Afghanist~ Asia       1977    38.4  1.49e7      786.
##  7 Afghanist~ Asia       1982    39.9  1.29e7      978.
##  8 Afghanist~ Asia       1987    40.8  1.39e7      852.
##  9 Afghanist~ Asia       1992    41.7  1.63e7      649.
## 10 Afghanist~ Asia       1997    41.8  2.22e7      635.
## # ... with 1,694 more rows

13.3.2 检查数据

# 是否有缺失值

gapdata %>%
  summarise(
    across(everything(), ~ sum(is.na(.)))
  )
## # A tibble: 1 x 6
##   country continent  year lifeExp   pop gdpPercap
##     <int>     <int> <int>   <int> <int>     <int>
## 1       0         0     0       0     0         0
  • country 代表国家
  • countinet 表示所在的洲
  • year 时间
  • lifeExp 平均寿命
  • pop 人口数量
  • gdpPercap 人均GDP

接下来,我们需要思考我们应该选择什么样的图,呈现这些不同类型的数据,探索数据背后的故事

13.4 基本绘图

13.4.1 柱状图

常用于一个离散变量

gapdata %>%
  ggplot(aes(x = continent)) +
  geom_bar()

gapdata %>%
  ggplot(aes(x = reorder(continent, continent, length))) +
  geom_bar()

gapdata %>%
  ggplot(aes(x = reorder(continent, continent, length))) +
  geom_bar() +
  coord_flip()

# geom_bar vs stat_count
gapdata %>%
  ggplot(aes(x = continent)) +
  stat_count()

gapdata %>% count(continent)
## # A tibble: 5 x 2
##   continent     n
##   <chr>     <int>
## 1 Africa      624
## 2 Americas    300
## 3 Asia        396
## 4 Europe      360
## 5 Oceania      24

可见,geom_bar() 自动完成了这个统计,更多geom与stat对应关系见这里

gapdata %>%
  distinct(continent, country) %>%
  ggplot(aes(x = continent)) +
  geom_bar()

我个人比较喜欢先统计,然后画图

gapdata %>%
  distinct(continent, country) %>%
  group_by(continent) %>%
  summarise(num = n()) %>%
  ggplot(aes(x = continent, y = num)) +
  geom_col()

13.4.2 直方图

常用于一个连续变量

gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_histogram() # 对应的stat_bin()

gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_histogram(binwidth = 1)

#' histograms, 默认使用 `position = "stack"`
gapdata %>%
  ggplot(aes(x = lifeExp, fill = continent)) +
  geom_histogram()

#' 使用`position = "identity"`
gapdata %>%
  ggplot(aes(x = lifeExp, fill = continent)) +
  geom_histogram(position = "identity")

gapdata %>%
  ggplot(aes(x = lifeExp, color = continent)) +
  geom_freqpoly()

#' smooth histogram = densityplot
gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_density()

如果不喜欢下面那条线,可以这样

gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_line(stat = "density")

# adjust 调节bandwidth,
# adjust = 1/2 means use half of the default bandwidth.
gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_density(adjust = 1)

gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_density(adjust = 0.2)

gapdata %>%
  ggplot(aes(x = lifeExp, color = continent)) +
  geom_density()

gapdata %>%
  ggplot(aes(x = lifeExp, fill = continent)) +
  geom_density(alpha = 0.2)

gapdata %>%
  filter(continent != "Oceania") %>%
  ggplot(aes(x = lifeExp, fill = continent)) +
  geom_density(alpha = 0.2)

gapdata %>%
  ggplot(aes(x = lifeExp)) +
  geom_density() +
  # facet_wrap(vars(continent))
  facet_grid(. ~ continent)

gapdata %>%
  filter(continent != "Oceania") %>%
  ggplot(aes(x = lifeExp, fill = continent)) +
  geom_histogram() +
  facet_grid(continent ~ .)

直方图和密度图画在一起。注意y = stat(density)表示y是由x新生成的变量,这是一种固定写法,类似的还有stat(count), stat(level)

gapdata %>%
  filter(continent != "Oceania") %>%
  ggplot(aes(x = lifeExp, y = stat(density))) +
  geom_histogram(aes(fill = continent)) +
  geom_density() +
  facet_grid(continent ~ .)

13.4.3 箱线图

一个离散变量 + 一个连续变量

#' 思考下结果为什么是这样?
gapdata %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_boxplot()

# 数据框中的year变量是数值型,需要先转换成因子型,弄成离散型变量
gapdata %>%
  ggplot(aes(x = as.factor(year), y = lifeExp)) +
  geom_boxplot()

# 明确指定分组变量
gapdata %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_boxplot(aes(group = year))

gapdata %>%
  ggplot(aes(x = year, y = lifeExp)) +
  geom_violin(aes(group = year)) +
  geom_jitter(alpha = 1 / 4) +
  geom_smooth(se = FALSE)

13.4.4 抖散图

点重叠的处理方案

gapdata %>% ggplot(aes(x = continent, y = lifeExp)) +
  geom_point()

gapdata %>% ggplot(aes(x = continent, y = lifeExp)) +
  geom_jitter()

gapdata %>% ggplot(aes(x = continent, y = lifeExp)) +
  geom_boxplot()

gapdata %>% ggplot(aes(x = continent, y = lifeExp)) +
  geom_boxplot() +
  geom_jitter()

gapdata %>%
  ggplot(aes(x = continent, y = lifeExp)) +
  geom_jitter() +
  stat_summary(fun.y = median, colour = "red", geom = "point", size = 5)

gapdata %>%
  ggplot(aes(reorder(x = continent, lifeExp), y = lifeExp)) +
  geom_jitter() +
  stat_summary(fun.y = median, colour = "red", geom = "point", size = 5)

注意到我们已经提到过 stat_count / stat_bin / stat_summary

gapdata %>%
  ggplot(aes(x = continent, y = lifeExp)) +
  geom_violin(
    trim = FALSE,
    alpha = 0.5
  ) +
  stat_summary(
    fun.y = mean,
    fun.ymax = function(x) {
      mean(x) + sd(x)
    },
    fun.ymin = function(x) {
      mean(x) - sd(x)
    },
    geom = "pointrange"
  )

13.4.5 山峦图

常用于一个离散变量 + 一个连续变量

gapdata %>%
  ggplot(aes(
    x = lifeExp,
    y = continent,
    fill = continent
  )) +
  ggridges::geom_density_ridges()

# https://learnui.design/tools/data-color-picker.html#palette
gapdata %>%
  ggplot(aes(
    x = lifeExp,
    y = continent,
    fill = continent
  )) +
  ggridges::geom_density_ridges() +
  scale_fill_manual(
    values = c("#003f5c", "#58508d", "#bc5090", "#ff6361", "#ffa600")
  )

gapdata %>%
  ggplot(aes(
    x = lifeExp,
    y = continent,
    fill = continent
  )) +
  ggridges::geom_density_ridges() +
  scale_fill_manual(
    values = colorspace::sequential_hcl(5, palette = "Peach")
  )

13.4.6 散点图

常用于两个连续变量

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

gapdata %>%
  ggplot(aes(x = log(gdpPercap), y = lifeExp)) +
  geom_point()

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  scale_x_log10() # A better way to log transform

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(color = continent))

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = (1 / 3), size = 2)

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth()

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE)

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~continent)

jCountries <- c("Canada", "Rwanda", "Cambodia", "Mexico")

gapdata %>%
  filter(country %in% jCountries) %>%
  ggplot(aes(x = year, y = lifeExp, color = country)) +
  geom_line() +
  geom_point()

gapdata %>%
  filter(country %in% jCountries) %>%
  ggplot(aes(
    x = year, y = lifeExp,
    color = reorder(country, -1 * lifeExp, max)
  )) +
  geom_line() +
  geom_point()

这是一种技巧,但我更推荐以下方法

d1 <- gapdata %>%
  filter(country %in% jCountries) %>%
  group_by(country) %>%
  mutate(end_label = if_else(year == max(year), country, NA_character_))

d1
## # A tibble: 48 x 7
## # Groups:   country [4]
##    country continent  year lifeExp    pop gdpPercap
##    <chr>   <chr>     <dbl>   <dbl>  <dbl>     <dbl>
##  1 Cambod~ Asia       1952    39.4 4.69e6      368.
##  2 Cambod~ Asia       1957    41.4 5.32e6      434.
##  3 Cambod~ Asia       1962    43.4 6.08e6      497.
##  4 Cambod~ Asia       1967    45.4 6.96e6      523.
##  5 Cambod~ Asia       1972    40.3 7.45e6      422.
##  6 Cambod~ Asia       1977    31.2 6.98e6      525.
##  7 Cambod~ Asia       1982    51.0 7.27e6      624.
##  8 Cambod~ Asia       1987    53.9 8.37e6      684.
##  9 Cambod~ Asia       1992    55.8 1.02e7      682.
## 10 Cambod~ Asia       1997    56.5 1.18e7      734.
## # ... with 38 more rows, and 1 more variable:
## #   end_label <chr>
d1 %>% ggplot(aes(
  x = year, y = lifeExp, color = country
)) +
  geom_line() +
  geom_point() +
  geom_label(aes(label = end_label)) +
  theme(legend.position = "none")

如果觉得麻烦,就用gghighlight宏包吧

gapdata %>%
  filter(country %in% jCountries) %>%
  ggplot(aes(
    x = year, y = lifeExp, color = country
  )) +
  geom_line() +
  geom_point() +
  gghighlight::gghighlight()

13.4.7 点线图

gapdata %>%
  filter(continent == "Asia" & year == 2007) %>%
  ggplot(aes(x = lifeExp, y = country)) +
  geom_point()

gapdata %>%
  filter(continent == "Asia" & year == 2007) %>%
  ggplot(aes(
    x = lifeExp,
    y = reorder(country, lifeExp)
  )) +
  geom_point(color = "blue", size = 2) +
  geom_segment(aes(
    x = 40,
    xend = lifeExp,
    y = reorder(country, lifeExp),
    yend = reorder(country, lifeExp)
  ),
  color = "lightgrey"
  ) +
  labs(
    x = "Life Expectancy (years)",
    y = "",
    title = "Life Expectancy by Country",
    subtitle = "GapMinder data for Asia - 2007"
  ) +
  theme_minimal() +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )

13.4.8 文本标注

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  ggforce::geom_mark_ellipse(aes(
    filter = gdpPercap > 70000,
    label = "有钱的国家",
    description = "他们是什么国家?"
  ))

ten_countries <- gapdata %>%
  distinct(country) %>%
  pull() %>%
  sample(10)
library(ggrepel)
gapdata %>%
  filter(year == 2007) %>%
  mutate(
    label = ifelse(country %in% ten_countries, as.character(country), "")
  ) %>%
  ggplot(aes(log(gdpPercap), lifeExp)) +
  geom_point(
    size = 3.5,
    alpha = .9,
    shape = 21,
    col = "white",
    fill = "#0162B2"
  ) +
  geom_text_repel(
    aes(label = label),
    size = 4.5,
    point.padding = .2,
    box.padding = .3,
    force = 1,
    min.segment.length = 0
  ) +
  theme_minimal(14) +
  theme(
    legend.position = "none",
    panel.grid.minor = element_blank()
  ) +
  labs(
    x = "log(GDP per capita)",
    y = "life expectancy"
  )

13.4.9 errorbar图

avg_gapdata <- gapdata %>%
  group_by(continent) %>%
  summarise(
    mean = mean(lifeExp),
    sd = sd(lifeExp)
  )
avg_gapdata
## # A tibble: 5 x 3
##   continent  mean    sd
##   <chr>     <dbl> <dbl>
## 1 Africa     48.9  9.15
## 2 Americas   64.7  9.35
## 3 Asia       60.1 11.9 
## 4 Europe     71.9  5.43
## 5 Oceania    74.3  3.80
avg_gapdata %>%
  ggplot(aes(continent, mean, fill = continent)) +
  # geom_col(alpha = 0.5) +
  geom_point() +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = 0.25)

13.4.10 椭圆图

gapdata %>%
  ggplot(aes(x = log(gdpPercap), y = lifeExp)) +
  geom_point() +
  stat_ellipse(type = "norm", level = 0.95)

13.4.11 2D 密度图

与一维的情形geom_density()类似, geom_density_2d(), geom_bin2d(), geom_hex()常用于刻画两个变量构成的二维区间的密度

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_bin2d()

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp)) +
  geom_hex()

13.4.12 马赛克图

geom_tile()geom_contour()geom_raster()常用于3个变量

gapdata %>%
  group_by(continent, year) %>%
  summarise(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = year, y = continent, fill = mean_lifeExp)) +
  geom_tile() +
  scale_fill_viridis_c()

事实上可以有更好的呈现方式

gapdata %>%
  group_by(continent, year) %>%
  summarise(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = year, y = continent, size = mean_lifeExp)) +
  geom_point()

gapdata %>%
  group_by(continent, year) %>%
  summarise(mean_lifeExp = mean(lifeExp)) %>%
  ggplot(aes(x = year, y = continent, size = mean_lifeExp)) +
  geom_point(shape = 21, color = "red", fill = "white") +
  scale_size_continuous(range = c(7, 15)) +
  geom_text(aes(label = round(mean_lifeExp, 2)), size = 3, color = "black") +
  theme(legend.position = "none")

13.5 主题风格

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggtitle("Life expectancy over time by continent")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  theme_grey() # the default

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  theme_bw()

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_calc() +
  ggtitle("ggthemes::theme_calc()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_economist() +
  ggtitle("ggthemes::theme_economist()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_economist_white() +
  ggtitle("ggthemes::theme_economist_white()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_few() +
  ggtitle("ggthemes::theme_few()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_gdocs() +
  ggtitle("ggthemes::theme_gdocs()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_tufte() +
  ggtitle("ggthemes::theme_tufte()")

gapdata %>%
  ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  geom_smooth(lwd = 3, se = FALSE, method = "lm") +
  ggthemes::theme_wsj() +
  ggtitle("ggthemes::theme_wsj()")

13.6 参考资料