17.3 Imputation
https://jiangjun.link/post/2018/12/r-missing-data
Fill missing values with the column median or mean:
library(dplyr)

df <- tibble(x = c(1, 2, NA, 5, 9, NA),
             y = c(NA, 20, 1, NA, 5, NA),
             z = 5:10)

# Replace each NA with the median of its own column
df %>% mutate_all(~ ifelse(is.na(.x), median(.x, na.rm = TRUE), .x))
#> # A tibble: 6 x 3
#>       x     y     z
#>   <dbl> <dbl> <int>
#> 1   1       5     5
#> 2   2      20     6
#> 3   3.5     1     7
#> 4   5       5     8
#> 5   9       5     9
#> 6   3.5     5    10
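The example above uses the median only. A mean-based variant might look like the sketch below. It assumes dplyr 1.0 or later (for `across()` and `where()`); restricting the imputation to double columns is a choice made here for illustration, so that the integer column `z` keeps its type. The name `df_mean` is hypothetical.

```r
library(dplyr)

df <- tibble(x = c(1, 2, NA, 5, 9, NA),
             y = c(NA, 20, 1, NA, 5, NA),
             z = 5:10)

# Replace each NA in the double columns with that column's mean.
# coalesce() keeps the non-missing values and fills the rest.
df_mean <- df %>%
  mutate(across(where(is.double), ~ coalesce(.x, mean(.x, na.rm = TRUE))))

df_mean
```

Swapping `mean` for `median` here should give the same values as the `mutate_all()` call above.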
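rlang::`%|%`: an infix operator that replaces the missing elements of a vector with a default value, e.g. x %|% 0.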
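A minimal sketch of the operator on a single vector, reusing the values of column `x` from the example above. The default on the right-hand side has to be of the same type as the vector, which is why a double is used here; `filled_zero` and `filled_median` are hypothetical names.

```r
library(rlang)

x <- c(1, 2, NA, 5, 9, NA)

# Fill NAs with a fixed default
filled_zero <- x %|% 0
filled_zero
#> [1] 1 2 0 5 9 0

# Fill NAs with the median of the observed values
filled_median <- x %|% median(x, na.rm = TRUE)
filled_median
#> [1] 1.0 2.0 3.5 5.0 9.0 3.5
```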
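simputation: a package that offers a uniform formula interface to several imputation methods, e.g. impute_lm() and impute_median().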
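As a rough illustration of the formula interface, the sketch below imputes `x` from `z` with a linear model via `impute_lm()`. This is regression imputation rather than the median fill shown earlier, and it is only meant to show the calling convention. A plain data frame is used instead of a tibble, and the name `df_lm` is hypothetical.

```r
library(simputation)

# Same values as the tibble above, stored as a base data frame
df <- data.frame(x = c(1, 2, NA, 5, 9, NA),
                 y = c(NA, 20, 1, NA, 5, NA),
                 z = 5:10)

# Left-hand side: variable to impute; right-hand side: predictors.
# The model is fitted on the rows where x is observed, then used to
# predict x for the rows where it is missing.
df_lm <- impute_lm(df, x ~ z)

df_lm
```

Column `y` is left untouched because it does not appear on the left-hand side of the formula.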