Session 5 Getting data in and out of R

We can’t get very far with R if we can’t get our raw data in, and our results out. This session introduces some of the basic techniques to do this.

5.1 Working directory

R can read data from files on disk. First, we need to understand where R will look for data. Running the command getwd() in the R console will reveal a file path, which is where R will look first for files. This is also likely to be the directory displayed in the Files pane (bottom right-hand corner in RStudio). This directory is called the working directory.

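You can customise the working directory using Session > Set Working Directory, and the options therein. Alternatively, you can use the setwd() command in the console, supplying a file path, e.g. setwd("C:/my/file/path"). Note the forward slashes: R accepts these on Windows too, whereas backslashes inside a string would need to be doubled.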
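As a minimal sketch of the two commands above, the following checks the working directory, changes it, and then changes it back. The use of tempdir() is only an assumption to make the example runnable on any machine; in practice you would supply your own folder.

```r
# Store and display the current working directory
old_wd <- getwd()
print(old_wd)

# Change the working directory.
# tempdir() is used purely for illustration; substitute your own path,
# e.g. setwd("C:/my/file/path")
setwd(tempdir())
print(getwd())

# List the files R can now see by name alone
list.files()

# Restore the original working directory
setwd(old_wd)
```

Storing the original directory in old_wd before changing it makes it easy to return to where you started.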

We will say more about directories when we introduce Projects later. For now, note that files need to be placed in the working directory (or in a subdirectory of it) for R to find them by name alone.

In general, it makes sense to also save your script files in the working directory (or a subdirectory of it).

Exercise: choose a sensible working directory, and save your current script file to that directory.

5.2 Base R solutions to importing data

Note: ‘Base R’ refers to the functions that come with R as standard, without installing additional packages.

A simple base R solution to reading in data is the read.csv function, which reads data from the CSV (comma-separated values) file format. Note that simple Excel spreadsheets can be saved in this format.

Once file.csv is placed in the working directory, it can be read into R using x <- read.csv("file.csv"), which reads in the file and assigns it to x. By default x will be a data frame.

Note that you can also place files into a subdirectory of the working directory. For example, it may be convenient to have a data subfolder, in which case you would use x <- read.csv("data/file.csv").
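The following sketch shows read.csv in action on a small made-up file. The file contents, the column names (id, age, group) and the use of a temporary folder are all assumptions for illustration; they are not the course data.

```r
# Create a "data" subfolder inside a temporary directory (illustration only)
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# Write a small made-up CSV file so that the example is self-contained
csv_path <- file.path(data_dir, "file.csv")
writeLines(c("id,age,group",
             "1,34,a",
             "2,51,b",
             "3,46,a"),
           csv_path)

# Read the file and assign the result to x
x <- read.csv(csv_path)

# Inspect what was read in
class(x)  # "data.frame"
str(x)    # column names and types
head(x)   # first few rows
```

It is worth running str() or head() after every import, to check that the columns have been read in with the names and types you expect.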

Exercise: download the file CHD.csv by right-clicking on the hyperlink and selecting ‘Save link as’. Save or move it to your working directory, and read it into an R object called chd.

5.3 Exporting data

Similarly, you can export data back out (in an appropriate format), using write.csv. Here you need to specify both the object to write, and the file to write it to, e.g. write.csv(x, "file.csv").
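A minimal sketch of exporting follows, using a small made-up data frame and a temporary file path (both are assumptions for illustration). By default write.csv also writes the row names as an extra first column; setting row.names = FALSE suppresses this, which is often what you want.

```r
# A small made-up data frame (illustration only)
x <- data.frame(id = 1:3, age = c(34.5, 51.2, 46.8))

# Write to a temporary location; in practice you might simply use "file.csv"
out_path <- file.path(tempdir(), "file_out.csv")
write.csv(x, out_path, row.names = FALSE)

# Read the file back in to check the round trip
y <- read.csv(out_path)
all.equal(x, y)
```

Reading the file back in is a quick way to confirm that what was written matches the original object.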

Exercise: write the object chd back to disk, in a new file called chd2.csv.

5.4 Further reading

There are many, many more options for reading and writing data. Here are some resources to find out more: