Chapter 5 Getting data
Where does data come from?
We can get data — typically in the form of vectors or tables — by importing it from other sources or by creating data from scratch.
In both cases, we want to end up with a rectangular data structure known as a tibble, which is a simplified version of an R data frame.
Two key topics (and corresponding R packages) of this chapter are:
Importing data with the readr package (Wickham et al., 2018)
Creating tibbles with the tibble package (Müller & Wickham, 2021)
Because both packages belong to the tidyverse (Wickham et al., 2019), their functions return tibbles rather than base R data frames.
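The two routes to a tibble can be sketched in a few lines. The following is a minimal illustration rather than a recommended workflow: the file contents, the column names, and the object names (csv_file, people_imported, people_created) are made up for this example, and the CSV file is written to a temporary location so that the code does not depend on any existing file.

```r
library(readr)   # importing data
library(tibble)  # creating tibbles

# Write a small, made-up CSV file to a temporary location
# (assumption: any real data file would be used in the same way):
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ann,31", "Ben,28"), csv_file)

# Route 1: Import the data from a file:
people_imported <- read_csv(csv_file, show_col_types = FALSE)

# Route 2: Create the same data from scratch:
people_created <- tibble(name = c("Ann", "Ben"),
                         age  = c(31, 28))

# Both objects are tibbles:
is_tibble(people_imported)
is_tibble(people_created)
```

Printing either object shows its dimensions and the data type of each column, which is a quick way to check that an import or a definition worked as intended.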
Key concepts of this chapter include file paths, parsing vectors and files, and various types of rectangular data structures (data frames, matrices, and tibbles).
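As a first impression of these concepts, the following sketch touches on each of them in turn. The folder and file names in the path are hypothetical (no such file needs to exist), and the small data objects df, mx, and tb are invented for illustration only.

```r
library(readr)
library(tibble)

# File paths: Compose a path from its parts
# (hypothetical folder and file name; the file is not read here):
data_path <- file.path("data", "my_file.csv")
basename(data_path)  # the file name without its folder

# Parsing vectors: Turn character strings into more specific data types:
parse_integer(c("1", "2", "3"))
parse_double(c("1.5", "2.25"))
parse_logical(c("TRUE", "FALSE"))

# Rectangular data structures:
df <- data.frame(x = 1:3, y = c("a", "b", "c"))  # base R data frame
mx <- matrix(1:6, nrow = 3)                      # matrix (a single data type)
tb <- as_tibble(df)                              # tibble

class(df)
class(mx)
class(tb)
```

Comparing the outputs of class() shows that a tibble is still a data frame, but carries additional classes that change how it is printed and subsetted.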
Preparation
Recommended readings for this chapter include:
of the ds4psy book (Neth, 2022a), and the corresponding chapters
of the r4ds book (Wickham & Grolemund, 2017).
Preflections
Before reading, please take some time to reflect upon the following questions:
If we wanted to answer a particular question, what data would we require?
Where do we get data from? Who has or provides it? Why or why not?
If someone had all the data we need, how would we obtain, load or enter it into R?
In this chapter, we will learn how to import or create tabular data structures in R.