Chapter 6 Importing data
In the previous chapter on tibbles (Chapter 5), we learned how data can be entered from scratch. However, we usually want to analyze data that we obtain from some source in some electronic format. Hence, importing data is the rule, rather than the exception.
Importing data is an early and usually mundane step in the process of data analysis. Under ideal circumstances, reading data would be so seamless that it would remain unnoticed. The fact that we need a chapter on it reminds us that our world is not ideal: Depending on their sources and types, messy datasets can be difficult to read. This is unfortunate, as it often discourages people from using R and drives them toward less powerful software that appears easier or more convenient to use.
Fortunately, R has tools to facilitate the import of all kinds of datasets. One such tool is the readr package (Wickham et al., 2018) — a core component of the tidyverse (Wickham, 2017) that provides a range of functions to read data from a variety of files and formats.
While importing and exporting data may be rather mundane steps in any data analysis, they are often necessary for everything else that follows. Thus, it pays off to take a closer look at the process of importing data and to learn some tricks for dealing with obstinate datasets. Similarly, some knowledge about file paths and data formats is a precondition for saving your own files in forms that make them accessible to others. This chapter should reduce future frustrations by covering the most important cases.