This book provides an introduction to data science that is tailored to the needs of students in psychology, but is also suitable for students of the humanities and other biological or social sciences. This audience typically has a basic familiarity with statistics, but rarely knows how data is prepared for statistical testing. By working with a variety of data types and many examples, this text teaches strategies and tools for reshaping, summarizing, and visualizing data. By drawing attention to the perils of misleading representations, the book fosters fundamental skills of data literacy and cultivates reproducible research practices that enable and precede any practical use of statistics.

The materials in this book are based on a course at the University of Konstanz, but can be used independently for a variety of curricula and purposes (see About for suggestions). The book is aimed at advanced undergraduate students and provides an introduction to data science in R (R Core Team, 2024) from a tidyverse (Wickham et al., 2019) perspective. Both the book and the course are supported by the R package ds4psy (Neth, 2023), which provides the datasets and functions used in the examples and exercises.


Neth, H. (2023). ds4psy: Data science for psychologists.
R Core Team. (2024). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.