Chapter 3 dplyr

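The biggest pain point in replacing the older plyr with dplyr is that reproducing the behavior of plyr::ddply in dplyr is rather cumbersome.

The do() function can do what ddply does, but do() is about to be retired.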

library(dplyr)

# some complex function: fit a linear model and return a one-row summary
func <- function(x) {
    mod <- lm(Sepal.Length ~ Petal.Width, data = x)
    mod_coefs <- broom::tidy(mod)

    tibble(
        mean_sepal_length = mean(x$Sepal.Length),
        mean_petal_width = mean(x$Petal.Width),
        slope = mod_coefs[[2, 2]],
        slope_p = mod_coefs[[2, 5]]
    )
}

# plyr version
plyr::ddply(iris, "Species", func)
##      Species mean_sepal_length mean_petal_width  slope   slope_p
## 1     setosa             5.006            0.246 0.9302 5.053e-02
## 2 versicolor             5.936            1.326 1.4264 4.035e-05
## 3  virginica             6.588            2.026 0.6508 4.798e-02

# dplyr with do()
iris %>%
    group_by(Species) %>%
    do(func(.))
## # A tibble: 3 x 5
## # Groups:   Species [3]
##   Species    mean_sepal_length mean_petal_width slope   slope_p
##   <fct>                  <dbl>            <dbl> <dbl>     <dbl>
## 1 setosa                  5.01            0.246 0.930 0.0505   
## 2 versicolor              5.94            1.33  1.43  0.0000404
## 3 virginica               6.59            2.03  0.651 0.0480
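
Since do() is on its way out, one possible replacement is group_modify(), which calls a function once per group and re-attaches the grouping column to the result. The following is a minimal sketch, not a definitive recipe. It assumes a dplyr version that provides group_modify(), and it uses a hypothetical helper func2(), a stand-in for func() above that reads the slope and its p-value from coef(summary(mod)) so that broom is not required.

```r
library(dplyr)

# func2 is a hypothetical stand-in for func(): same output columns,
# but the coefficients come from base R instead of broom::tidy()
func2 <- function(x) {
    mod <- lm(Sepal.Length ~ Petal.Width, data = x)
    coefs <- coef(summary(mod))

    tibble(
        mean_sepal_length = mean(x$Sepal.Length),
        mean_petal_width = mean(x$Petal.Width),
        slope = coefs["Petal.Width", "Estimate"],
        slope_p = coefs["Petal.Width", "Pr(>|t|)"]
    )
}

# dplyr with group_modify(): .x is the data for one group,
# without the grouping column
res <- iris %>%
    group_by(Species) %>%
    group_modify(~ func2(.x)) %>%
    ungroup()

res
```

The result should have one row per species and the same five columns as the do() version. The final ungroup() is optional; it only drops the grouping that group_modify() keeps.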