This section provides some pointers to additional resources on tidy data.
7.5.1 Help on tidying data
For additional details on the tidyr package (Wickham & Henry, 2020):
- see vignette("pivot"), as well as the documentation of the individual tidyr functions;
- study the RStudio cheat sheet on reshaping data with the tidyr package (on the back of the Data Import cheat sheet).
- Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23. doi: 10.18637/jss.v059.i10 (available at https://www.jstatsoft.org/article/view/v059i10)
For a critical view, see the following blog post:
- What is “tidy data”? (by John Mount)
The section Related work on https://tidyr.tidyverse.org provides some historical notes (e.g., on the relation between tidyr and reshape), pointers on terminology between different frameworks (e.g., SQL), and recommends the following papers:
- An interactive framework for data cleaning (Potter’s wheel)
A powerful alternative framework for data cleaning and wrangling is provided by the data.table package (Dowle & Srinivasan, 2019).
- See https://rdatatable.gitlab.io/data.table/ and the documentation to get started.
Check out Wikipedia: Tidy data for additional details and links.
The commands of tidyr are first steps, rather than the ultimate solution to data wrangling. This area is currently under active development and only the future will show which framework will ultimately be adopted. And rather than despairing about technological changes, we all should feel happy — as in the Chinese proverb — to live in interesting times…
[07_tidy.Rmd updated on 2020-03-24 22:29:54 by hn.]
Dowle, M., & Srinivasan, A. (2019). data.table: Extension of 'data.frame'. Retrieved from https://CRAN.R-project.org/package=data.table
Wickham, H. (2014b). Tidy data. Journal of Statistical Software, 59(10), 1–23. https://doi.org/10.18637/jss.v059.i10
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Retrieved from http://r4ds.had.co.nz
Wickham, H., & Henry, L. (2020). tidyr: Easily tidy data with 'spread()' and 'gather()' functions. Retrieved from https://CRAN.R-project.org/package=tidyr