This book provides an introduction to data science that is tailored to the needs of psychologists, but is also suitable for students of the humanities and other biological or social sciences. This audience typically has a basic familiarity with statistics, but rarely knows how data is prepared and shaped for analysis and testing. By working with a variety of data types and including many examples, this text teaches tools for transforming, summarizing, and visualizing data. By drawing attention to the perils of misleading representations, the book fosters fundamental skills of data literacy and cultivates reproducible research practices that enable and precede any practical use of statistics.

The materials in this book are based on a course taught at the University of Konstanz in 2020. The course provides an introduction to data science in R (R Core Team, 2020) from a tidyverse (Wickham, Averick, et al., 2019) perspective and previously relied on R for Data Science (Wickham & Grolemund, 2017) as its textbook. Both this book and the course are supported by the R package ds4psy (Neth, 2020), which provides datasets and functions used in the examples and exercises.


Neth, H. (2020). ds4psy: Data science for psychologists. Retrieved from

R Core Team. (2020). R: A language and environment for statistical computing. Retrieved from

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.

Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Retrieved from