Chapter 5 Transforming data

Anyone who regularly works with data knows that transforming data (a.k.a. “data munging” or “data wrangling”) is an essential prerequisite for any successful data analysis.

Key topics (and corresponding R packages) of this chapter are:

All these sections and packages are designed for manipulating data structures (mostly vectors or tables) into other data structures (more vectors or tables).

Although transforming data can be viewed as a challenge and a task in itself, our primary goal is usually to gain insights into the contents of our data. From this perspective, transforming data (e.g., into “tidy” data) becomes an intermediate goal, or a means to another end: a way of representing our data so that it can be processed more easily and rapidly.
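To make the idea of re-shaping data more concrete, here is a minimal sketch that stores the same four values in two different shapes. It uses only base R rather than the packages covered in this chapter. The toy table and all of its names (wide, long, id, pre, post, time, score) are made up for illustration and do not come from the book's data.

```r
# Hypothetical toy data: two measurements per person, stored in a "wide" shape
# (one row per person, one column per measurement occasion).
wide <- data.frame(
  id   = c("A", "B"),
  pre  = c(10, 20),
  post = c(15, 18)
)

# The same values in a "long" shape:
# one row per combination of person and measurement occasion.
long <- data.frame(
  id    = rep(wide$id, times = 2),
  time  = rep(c("pre", "post"), each = nrow(wide)),
  score = c(wide$pre, wide$post)
)

# Both tables contain the same information, but the long shape
# makes grouped summaries straightforward:
means <- tapply(long$score, long$time, mean)
means  # mean score is 15 for "pre" and 16.5 for "post"
```

Neither shape is better in itself. Which one is more convenient depends on the question we want to answer and on the tools we use to answer it.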

Preparation

Recommended readings for this chapter include:

of the ds4psy book (Neth, 2021), and the corresponding chapters

of the r4ds book (Wickham & Grolemund, 2017).

Preflections

Before reading, please take some time to reflect upon the following questions:

i2ds: Preflections

  • Assuming we had all the data required for answering our question, which additional obstacles would we face?

  • The same data can be stored in different data structures. Which ones? (Think in terms of different data types, data shapes, and corresponding data structures.)

  • Does it matter in what shape data is stored? Why or why not?

References

Bache, S. M., & Wickham, H. (2014). magrittr: A forward-pipe operator for R. https://CRAN.R-project.org/package=magrittr
Neth, H. (2021). Data science for psychologists. Social Psychology; Decision Sciences, University of Konstanz. https://bookdown.org/hneth/ds4psy/
Wickham, H., François, R., Henry, L., & Müller, K. (2021). dplyr: A grammar of data manipulation. https://CRAN.R-project.org/package=dplyr
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media, Inc. http://r4ds.had.co.nz