13.1 skimr
skimr(Waring et al. 2019) 是由 rOpenSci project 开发的用于探索性数据分析的包,可以看作增强版的 summary()
,根据不同的列类型返回整洁有用的统计量。如:
library(skimr)
skim(iris)
#> -- Data Summary ------------------------
#> Values
#> Name iris
#> Number of rows 150
#> Number of columns 5
#> _______________________
#> Column type frequency:
#> factor 1
#> numeric 4
#> ________________________
#> Group variables None
#>
#> -- Variable type: factor -------------------------------------------------------
#> # A tibble: 1 x 6
#> skim_variable n_missing complete_rate ordered n_unique
#> * <chr> <int> <dbl> <lgl> <int>
#> 1 Species 0 1 FALSE 3
#> top_counts
#> * <chr>
#> 1 set: 50, ver: 50, vir: 50
#>
#> -- Variable type: numeric ------------------------------------------------------
#> # A tibble: 4 x 11
#> skim_variable n_missing complete_rate mean sd p0 p25 p50 p75
#> * <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Sepal.Length 0 1 5.84 0.828 4.3 5.1 5.8 6.4
#> 2 Sepal.Width 0 1 3.06 0.436 2 2.8 3 3.3
#> 3 Petal.Length 0 1 3.76 1.77 1 1.6 4.35 5.1
#> 4 Petal.Width 0 1 1.20 0.762 0.1 0.3 1.3 1.8
#> p100 hist
#> * <dbl> <chr>
#> 1 7.9 <U+2586><U+2587><U+2587><U+2585><U+2582>
#> 2 4.4 <U+2581><U+2586><U+2587><U+2582><U+2581>
#> 3 6.9 <U+2587><U+2581><U+2586><U+2587><U+2582>
#> 4 2.5 <U+2587><U+2581><U+2587><U+2585><U+2583>
由于 skim()
的返回结果在 bookdown 里显示效果不太好,这里只给出一个最简单的例子,关于该包的具体使用可见 Introduction to skimr