Where does data (e.g., a data frame or tibble) come from? If we don’t enter it ourselves (e.g., with the tribble command; see Chapter 5 on tibbles), we usually import it from an external source. The range of such sources is vast, and here we only cover the most common candidates: data that is already stored in text form or in other file formats that can easily be coerced into linear or rectangular data structures.
This chapter discusses options and potential pitfalls when using the readr package (Wickham et al., 2018) for data import. readr provides fast and friendly ways of reading vectors and rectangular data files (like fwf) and of writing files in various formats.
After working through this chapter, you will be able to:
- orient yourself on your computer (i.e., know your working directory and specify absolute and relative paths to other directories);
- use readr commands to
  - parse vectors of various data types;
  - import files of various formats;
  - export files in various formats;
- avoid exotic or proprietary file formats (not only in R).
6.1.3 Being here
A modern alternative to using the setwd() function is provided by the here package (Müller, 2017), which answers the question “Where am I?” in a straightforward manner: You are here(). When it is loaded, here determines the path to your current working directory (or project directory). It provides a here() function that returns the path to this directory or to other directories, whose names are provided as additional arguments (of type character):
library(here)  # loads the package

here::here()  # returns your current main directory
#> [1] "/Users/hneth/Desktop/stuff/Dropbox/GitHub/ds4psy_book"

here::here("data")  # returns the subdirectory "/data"
#> [1] "/Users/hneth/Desktop/stuff/Dropbox/GitHub/ds4psy_book/data"
The brilliant idea of here is that all paths within a project can be specified relative to your current working directory (or project directory), i.e., the directory that here() returns.
Note: As the R package lubridate also contains a (deprecated) function named here() (see Chapter 10: Time), we write here::here() to make explicit that we want to use the function from the here package.
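To illustrate how paths can be built with here(), the following is a minimal sketch. It assumes that the here package is installed; the file name "my_data.csv" is made up for illustration and need not exist on your computer.

```r
library(here)

# Build the path to a (hypothetical) file in the "data" subdirectory.
# Each additional character argument adds one level to the path:
my_file <- here::here("data", "my_data.csv")  # "my_data.csv" is a made-up name
my_file

# The same location, assembled by hand from the main directory:
by_hand <- file.path(here::here(), "data", "my_data.csv")
by_hand

# Check whether such a file actually exists on this computer:
file.exists(my_file)
```

As the path is assembled from the main directory that here() determines, the same code can be used on other computers on which the project is stored in a different location.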
6.1.4 Data used
In this chapter, we will use a variety of data files. As many of them are stored in non-standard formats, they are not included in the ds4psy package, but stored on a web server (at http://rpository.com). Below, we will illustrate how they can be imported directly from their online source. Alternatively, you can use a web browser to download the files to a directory on your computer (e.g., in a sub-directory called data) and import them from there.
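The two options (importing from an online source vs. from a local copy) can be sketched as follows. This is only an illustration under stated assumptions: The web address and the file name "my_data.csv" are placeholders (not actual files on the server), so the online steps are commented out and a small temporary file stands in for a downloaded file.

```r
library(readr)

# Option 1: Import directly from a URL
# (placeholder address, hence commented out):
# my_url <- "http://rpository.com/path/to/my_data.csv"
# df <- readr::read_csv(my_url)

# Option 2: Download the file into a local "data" sub-directory first
# (also commented out, as the address above is only a placeholder):
# download.file(my_url, destfile = file.path("data", "my_data.csv"))

# Stand-in for a downloaded file, so that this sketch runs offline:
my_file <- file.path(tempdir(), "my_data.csv")
writeLines(c("name,age", "Ann,31", "Ben,28"), my_file)

# Import the local copy:
df <- readr::read_csv(my_file)
df
```

Working with a local copy has the advantage that the data remains available without a network connection, whereas importing from the online source always retrieves the current version of the file.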
6.1.5 Getting ready
This chapter formerly assumed that you have read and worked through Chapter 11: Import data of the r4ds textbook (Wickham & Grolemund, 2017). It can now be read on its own, but reading Chapter 11: Import data of r4ds is still recommended.
Please do the following to get started:
Structure your document by inserting headings and empty lines between different parts. Here’s an example of how your initial file could look:
---
title: "Chapter 6: Importing data"
author: "Your name"
date: "2020 May 25"
output: html_document
---

Add text or code chunks here.

# Exercises (06: Importing data)

## Exercise 1

## Exercise 2

etc.

<!-- The end (eof). -->
Create an initial code chunk below the header of your .Rmd file that loads the R packages of the tidyverse (and see Section E.3.3 if you want to get rid of the messages and warnings of this chunk in your HTML output).
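The content of such an initial chunk could look as follows. This is a minimal sketch that assumes the tidyverse package is installed. Wrapping the call in suppressPackageStartupMessages() is one possible way of silencing the start-up messages and not necessarily the approach described in Section E.3.3 (setting the knitr chunk options message = FALSE and warning = FALSE in the chunk header is a common alternative).

```r
# Load the packages of the tidyverse without printing their start-up messages:
suppressPackageStartupMessages(library(tidyverse))
```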
Save your file (e.g., as 06_import.Rmd in the R folder of your current project) and remember to save and knit it regularly as you keep adding content to it.
Now that we can orient ourselves on our computers and navigate between various directories, we are ready to read more about reading data with readr.
Müller, K. (2017). here: A simpler way to find your files. Retrieved from https://CRAN.R-project.org/package=here
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Retrieved from http://r4ds.had.co.nz
Wickham, H., Hester, J., & Francois, R. (2018). readr: Read rectangular text data. Retrieved from https://CRAN.R-project.org/package=readr