4.2 Importing data with readr

The readr package provides functions for reading (or “parsing”) vectors and data files into rectangular R objects known as tibbles.

Key concepts of this section include:

  • working directory
  • file paths (absolute vs. relative)
  • reading vs. writing files

4.2.1 File locations and paths

A well-organized project typically contains various (sub-)directories for storing different types of data. For instance, many projects contain dedicated sub-directories for data, images, or code files.

Because not all files are stored in the same directory, we need to know or set the current working directory and to point to the locations of files in other directories. When working with RStudio projects, RStudio sets a session’s initial working directory to the project folder.

File paths describe locations on a computer and are typically encoded as character strings. They usually need to be specified when loading a data file, linking to an image, or referring to any other file.

To make an R project as self-contained as possible (i.e., independent of the particular folder structure on our personal computer), all files needed in a project should be stored in the project folder or its sub-directories. When referring to a file within the project, always use relative file paths (i.e., paths that start from the project folder) to specify its location.
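
As a brief illustration of the difference between relative and absolute paths, the following sketch assembles a relative path with base R's file.path() and then prefixes it with the current working directory. The directory name "data" and the file name "my_data.csv" are hypothetical placeholders, not files assumed to exist in any project.

```r
# Build a relative path from its components (names are hypothetical):
rel_path <- file.path("data", "my_data.csv")

# The corresponding absolute path depends on the current working directory:
abs_path <- file.path(getwd(), rel_path)

# Check whether the file actually exists before trying to read it:
file_found <- file.exists(rel_path)
```

The relative path stays valid when the project folder is moved or shared, whereas the absolute path only works on the computer on which it was created.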

Key commands for getting and setting file paths in R include:

# (1) Getting and setting the working directory:
getwd()        # get current working directory (as an absolute path)
wd <- getwd()  # store this path

setwd(wd)  # set working directory (here: to the stored absolute path)

# (2) Navigating with relative file paths:
setwd(".")       # "." marks current location (no change)
setwd("./data")  # move 1 level down into "data" (error if "data" does not exist)

setwd("..")    # move 1 level up (back to the starting directory)
setwd("./..")  # same as "..": move 1 more level up (above the starting directory)
setwd(wd)      # return to the stored starting directory

# Assuming 2 sub-directories ("./code" and "./data"):
setwd("code")     # move down into directory "code"
setwd("../data")  # move into parallel directory "data"
setwd("../code")  # move into parallel directory "code"
setwd("..")       # move 1 level up (back to the starting directory)
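
Because setwd() changes the state of the entire R session, it can help to practice such navigation in a throwaway folder and to restore the original working directory afterwards. The following is a minimal sketch under that assumption: the project name "my_project" is hypothetical, and the folder is created in R's temporary directory so that no existing files are affected.

```r
# Remember the current working directory so that it can be restored later:
old_wd <- getwd()

# Create a throwaway project with 2 sub-directories (hypothetical names):
proj <- file.path(tempdir(), "my_project")
dir.create(file.path(proj, "code"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(proj, "data"), recursive = TRUE, showWarnings = FALSE)

setwd(proj)       # move into the project folder (absolute path)
setwd("code")     # move down into directory "code"
setwd("../data")  # move into parallel directory "data"
in_data <- basename(getwd())  # name of the current directory

setwd("..")       # move 1 level up (back to the project folder)
in_proj <- basename(getwd())  # name of the current directory

setwd(old_wd)     # restore the original working directory
```

Storing the initial directory first and restoring it at the end leaves the session in the state in which it started.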

The here package (Müller, 2017) simplifies this task by constructing file paths relative to a project’s root directory, but using it still requires an understanding of file paths.

4.2.2 Reading and writing files

Key readr functions include:

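  • read_csv() vs. read_csv2() for reading comma-separated data files (with "." as decimal mark) vs. semicolon-separated data files (with "," as decimal mark)

  • read_delim() for reading data files with an arbitrary delimiter (e.g., tabs or "|"), specified by its delim argument

  • write_csv() vs. write_csv2() for writing comma-separated data files (with "." as decimal mark) vs. semicolon-separated data files (with "," as decimal mark)

  • write_delim() for writing data files with an arbitrary delimiter, specified by its delim argument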
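
The functions listed above can be tried out by writing a small table to temporary files and reading it back in. The following is a minimal sketch: the data frame df and its values are made up for illustration, the files are stored in R's temporary directory, and the show_col_types argument (which suppresses the column specification message) assumes readr 2.0 or later.

```r
library(readr)

# A small example table (hypothetical data):
df <- data.frame(id = 1:3, score = c(1.5, 2.25, 3))

# Write and read with "," as delimiter and "." as decimal mark:
f1 <- tempfile(fileext = ".csv")
write_csv(df, f1)
d1 <- read_csv(f1, show_col_types = FALSE)

# Write and read with ";" as delimiter and "," as decimal mark:
f2 <- tempfile(fileext = ".csv")
write_csv2(df, f2)
d2 <- read_csv2(f2, show_col_types = FALSE)

# Write and read with a tab as delimiter:
f3 <- tempfile(fileext = ".tsv")
write_delim(df, f3, delim = "\t")
d3 <- read_delim(f3, delim = "\t", show_col_types = FALSE)
```

All three variants should return tibbles with the same values. They differ only in how the values are separated and in how decimal numbers are written in the file.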