This section provides some pointers to additional resources on tidy data.
7.5.1 Help on tidying data
For additional details on the tidyr package (Wickham & Henry, 2020):
- study vignette("pivot"), as well as the documentation of the pivoting functions it describes
- study the RStudio cheatsheet on reshaping data with the tidyr package (on the back of the Data Import cheatsheet)
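As a quick reminder of what these resources cover, the following minimal sketch reshapes a small table from wide to long format and back again. The data frame `wide` and its column names are made up for illustration; only `pivot_longer()` and `pivot_wider()` are tidyr functions.

```r
library(tidyr)

# Hypothetical toy data in wide format: one row per person, one column per year.
wide <- data.frame(
  name   = c("Ann", "Bob"),
  `2019` = c(10, 20),
  `2020` = c(11, 21),
  check.names = FALSE  # keep the non-syntactic column names "2019" and "2020"
)

# Wide to long: collect all columns except `name` into a year/score pair.
long <- pivot_longer(wide, cols = -name,
                     names_to = "year", values_to = "score")

# Long to wide: spread the year/score pair back into one column per year.
wide_again <- pivot_wider(long, names_from = year, values_from = score)
```

Here, `long` contains one row per combination of person and year, and `wide_again` recovers the original layout (as a tibble).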
For background information on the notion of tidy data, see the following paper by Hadley Wickham (2014b):
- Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23. doi: 10.18637/jss.v059.i10 (available at https://www.jstatsoft.org/article/view/v059i10)
For a critical view, see the following blog post:
- What is “tidy data?” (by John Mount)
The section Related work on https://tidyr.tidyverse.org provides some historical notes (e.g., on the relation between tidyr and reshape), pointers on terminology between different frameworks (e.g., SQL), and recommends the following papers:
- An interactive framework for data cleaning (Potter’s wheel)
A powerful alternative framework for data cleaning and wrangling is provided by the data.table package (Dowle & Srinivasan, 2021).
- See https://rdatatable.gitlab.io/data.table/ and the documentation to get started.
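To give a first impression of this alternative, the following sketch performs a wide-to-long and long-to-wide reshaping with data.table's `melt()` and `dcast()` functions. The table `wide_dt` and its column names are hypothetical and serve only as an illustration.

```r
library(data.table)

# Hypothetical toy data in wide format: one row per person, one column per year.
wide_dt <- data.table(
  name  = c("Ann", "Bob"),
  y2019 = c(10, 20),
  y2020 = c(11, 21)
)

# Wide to long: all columns other than `name` are stacked into year/score.
long_dt <- melt(wide_dt, id.vars = "name",
                variable.name = "year", value.name = "score")

# Long to wide: one row per name, one column per level of `year`.
wide_again_dt <- dcast(long_dt, name ~ year, value.var = "score")
```

The formula interface of `dcast()` (rows on the left of `~`, new columns on the right) plays the role of the `names_from` and `values_from` arguments in tidyr.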
Check out Wikipedia: Tidy data for additional details and links.
The commands of tidyr are first steps, rather than the ultimate solution to data wrangling. This area is under active development, and only time will tell which framework will ultimately be adopted. Rather than despairing about technological change, we should feel happy, as in the (supposedly) Chinese proverb, to live in interesting times…
[07_tidy.Rmd updated on 2021-09-22 20:38:33 by hn.]