4.2 Importing data with readr
The readr package provides functions for reading (or “parsing”) vectors and data files into rectangular R objects known as tibbles.
Resources for this section include:
The R package readr (Wickham, Hester, & Francois, 2018)
Key concepts of this section include:
- working directory
- file paths (absolute vs. relative)
- reading vs. writing files
4.2.1 File locations and paths
A well-organized project typically contains various (sub-)directories for storing different types of data. For instance, many projects contain dedicated sub-directories for
Because not all files are stored in the same directory, we need to know (or set) our current working directory and be able to point to the locations of files in other directories. When working with RStudio projects, R sets a session’s original working directory to the project folder.
File paths are descriptions of locations on a computer, typically encoded as character strings. They usually need to be specified when loading a data file, linking to an image, or referring to any other file.
To make an R project as self-contained as possible (i.e., independent of the particular folder structure on our personal computer), all files needed in a project should be stored in the project folder or its sub-directories. When including a file from some folder, always use relative file paths to specify its location.
Key commands for getting and setting file paths in R include:
# (1) Getting and setting file path:
getwd()        # get current (absolute) file path
wd <- getwd()  # store file path
setwd(wd)      # set current (absolute) file path

# (2) Navigating relative file paths:
setwd(".")       # "." marks current location
setwd("./data")  # move 1 level down into "data" (if "data" exists)
setwd("..")      # move 1 level upwards
setwd("./..")    # move 1 level upwards (from current location)

# Assuming 2 sub-directories ("./code" and "./data"):
setwd("code")     # move down into directory "code"
setwd("../data")  # move into parallel directory "data"
setwd("../code")  # move into parallel directory "code"
setwd("..")       # move 1 level up
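To see how a relative path resolves against the working directory, the following sketch builds a throwaway project inside R's temporary directory and looks up a file in it. It uses base R only. The names proj, data, and my_data.csv are hypothetical and chosen purely for illustration.

```r
# Create a throwaway project folder with a "data" sub-directory (hypothetical names):
proj <- file.path(tempdir(), "proj")
dir.create(file.path(proj, "data"), recursive = TRUE, showWarnings = FALSE)
writeLines("x,y\n1,2", file.path(proj, "data", "my_data.csv"))

old_wd <- setwd(proj)  # setwd() invisibly returns the previous working directory

rel_path <- file.path("data", "my_data.csv")  # path relative to the working directory
found    <- file.exists(rel_path)             # the file is found from within proj
abs_path <- normalizePath(rel_path)           # the corresponding absolute path

setwd(old_wd)  # restore the original working directory
```

Storing the value returned by setwd() makes it easy to return to the starting directory, so that the sketch leaves the session as it found it.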
The here package (Müller, 2017) simplifies these commands, but also requires an understanding of file paths.
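As a brief sketch of this approach, here::here() builds a path from the project root that the package detects, regardless of the current working directory. The example assumes that the here package is installed (it is skipped otherwise), and the file data/my_data.csv is hypothetical.

```r
# Only run if the here package is available:
if (requireNamespace("here", quietly = TRUE)) {
  root      <- here::here()                       # project root as detected by here
  data_path <- here::here("data", "my_data.csv")  # path to a hypothetical data file
  print(data_path)
}
```

The resulting path is built from the detected root, so the same call can be used from scripts stored in different sub-directories of a project.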
4.2.2 Reading and writing files
Key readr functions include:
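- read_csv() for reading comma-separated data files (read_csv2() is the variant for semicolon-separated files that use a comma as the decimal mark)
- read_delim() for reading data files that use other delimiters
- write_csv() for writing comma-separated data files (write_csv2() writes the semicolon-separated variant)
- write_delim() for writing data files that use other delimiters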
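A minimal round trip illustrates how writing and reading work together. The sketch below assumes that readr is installed, uses a small made-up data frame, and writes to temporary files so that no project files are touched. It uses write_csv() and read_csv() for a comma-separated file, and write_delim() and read_delim() for a tab-delimited one.

```r
library(readr)

# A small example data frame (made-up data):
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

# Write and re-read a comma-separated file:
csv_file <- tempfile(fileext = ".csv")
write_csv(df, csv_file)
df_csv <- read_csv(csv_file)  # returns a tibble

# Write and re-read a tab-delimited file:
tab_file <- tempfile(fileext = ".txt")
write_delim(df, tab_file, delim = "\t")
df_tab <- read_delim(tab_file, delim = "\t")  # returns a tibble
```

Both read functions guess the type of each column from the file contents, so it is worth checking the resulting tibble before working with it.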