1 Tutorial 1: Install R/R Studio

After working through Tutorial 1, you’ll…

  • know how to install R and R Studio
  • understand the main set-up of R Studio

Important: You only have to install R/R Studio on your personal PC. Both are already installed in the CIP Pool on Akademiestr.

1.1 Install R

R is the programming language we’ll use to import, edit, and analyze data. Please use CRAN to install the newest version of R (version 4.3.1, “Beagle Scouts”). You’ll have to select your operating system to download the right version.

If you already have R on your computer, make sure to update it to the newest version. To do so, download and install the latest version from CRAN. Then, in R Studio, choose the newest version via “Tools/Global Options/General/R version”. You may have to close and re-open R Studio for the newer version to take effect.
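To confirm which version of R is actually running, you can ask R itself. The following is a minimal sketch that only uses functions shipped with R; the object names current_version and is_up_to_date are made up for illustration, and the comparison simply reuses the version number 4.3.1 mentioned above:

```r
# Print the version string of the R installation that is currently running
R.version.string

# getRversion() returns the version in a form that can be compared directly
current_version <- getRversion()
current_version

# TRUE if the running version is at least the one named in this tutorial
is_up_to_date <- current_version >= "4.3.1"
is_up_to_date
```

If is_up_to_date is FALSE, an older version of R is still in use.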

1.2 Install R Studio

Next, install R Studio. R Studio is a desktop application with a graphical interface that facilitates programming with R. The newest version of R Studio (2023-09-28) can be downloaded via this link.

If you already have R Studio on your computer, simply make sure to update it to the newest version. The easiest way to do this is via: “Help/Check for Updates” in R Studio.

1.3 Object- and function-oriented programming

First things first: R is an object- and function-oriented programming language. Chambers (2014, p. 4) has nicely summarized what this means:

  • Everything that exists is an object.
  • Everything that happens is a function call.

We assign values (for instance, single numbers/letters, several numbers/letters, or whole data sets) to objects in R to work with them and run computations. For instance, the following command in R assigns the word “hello” to the object word by using the <- sign (a function used for assigning values to objects):

word <- "hello"

Objects have specific properties that determine which types of calculations can be done with them (and which cannot). For instance, the object word consists of characters (i.e., a word), which prevents you from calculating the mean of this object (something that is only possible for objects consisting of numeric data).
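A short sketch of how an object's type limits what you can do with it. The object numbers and the helper object word_mean are made up for illustration; class(), mean(), and suppressWarnings() ship with R:

```r
word <- "hello"
numbers <- c(1, 2, 3)

# class() reports what kind of data an object holds
class(word)     # "character"
class(numbers)  # "numeric"

# The mean can be calculated for numeric data
mean(numbers)   # 2

# For text, mean() returns NA and issues a warning (suppressed here)
word_mean <- suppressWarnings(mean(word))
word_mean       # NA
```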

This already hints at the second important aspect of R: it is influenced by functional programming, meaning that everything we do in R is a function call. R uses functions to assign specific values (for instance, a single number or word, several numbers or words, or whole data sets) to objects. In short, learning to program in R means learning how to call and write functions that make R do the calculations you need.
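To see that even assignment and arithmetic are function calls, you can write them in the same form as any other function. This is only an illustration (the object name word_2 is made up); in everyday code you would use the usual shorthand:

```r
# The usual way of assigning a value ...
word <- "hello"

# ... is shorthand for calling the function `<-` with two arguments
`<-`(word_2, "hello")

# Both objects hold exactly the same value
identical(word, word_2)  # TRUE

# Arithmetic operators are functions too: 1 + 2 can be written as
`+`(1, 2)                # 3
```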

Let’s take SPSS - which, I assume, many of you have worked with previously - as a comparison. If you import, edit, or analyze a data set in SPSS, you’ll use the point-and-click menu to change variable values, calculate variables’ means, or export data.

R works differently: You assign the data set to an object - for instance, an object called word, as done previously. This assignment enables you to work with the data: You can now call specific functions to work with or change this object. For instance, if you want to add further words to the object word, such as “ and good morning”, you could do that with a function called paste0(), which takes the original object word and appends “ and good morning”. Assigning the result to word overwrites the old object:

word <- paste0(word, " and good morning")
word
## [1] "hello and good morning"

Unlike with the point-and-click menu in SPSS, you thus need to write your own code to import, edit, or analyze any type of data in R. Luckily, R already includes a lot of predefined functions, meaning that we do not have to write all of these functions ourselves.
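To illustrate the difference between a predefined function and one you write yourself, here is a minimal sketch. The object values and the function my_mean() are made up for this example; in practice you would simply use the predefined mean():

```r
values <- c(2, 4, 6, 8)

# Predefined function that ships with R
mean(values)     # 5

# Hand-written equivalent (hypothetical name, for illustration only)
my_mean <- function(x) {
  sum(x) / length(x)
}

my_mean(values)  # 5
```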

1.4 Base-R vs. tidyverse & coding style

There are two “ways” of using R: via Base-R or via the tidyverse, see this overview for a comparison.

Base-R includes all functions that come with initially installing R. A more recent development is the tidyverse, a collection of additional packages (and coding conventions) that is especially suited for beginners. The tidyverse was developed by Hadley Wickham and his team at RStudio, the company behind the interface we use for programming with R.
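To give a first impression of the difference, the sketch below does the same task both ways, using mtcars, an example data set that ships with R. The tidyverse part relies on dplyr, one of the tidyverse packages; it is an assumption that dplyr is already installed, so that part only runs if it is. The object names base_result and tidy_result are made up for illustration:

```r
# Base-R: keep cars with more than 25 miles per gallon,
# and only the columns mpg and cyl
base_result <- mtcars[mtcars$mpg > 25, c("mpg", "cyl")]
base_result

# tidyverse (dplyr): the same two steps, written as a pipeline.
# This part runs only if the dplyr package is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  tidy_result <- mtcars |>
    dplyr::filter(mpg > 25) |>
    dplyr::select(mpg, cyl)
  print(tidy_result)
}
```

Both versions select the same rows and columns; they differ in how the steps are written down.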

Best practice when coding with R is to stick to either Base-R or the tidyverse, as this makes your code more streamlined and readable. In this class, we will rely on the tidyverse. You will learn a little bit of Base-R in tutorials 2-3 so that you can read such code (e.g., when you come across it in online forums while looking for R-related code solutions). However, we will then rely on the tidyverse in subsequent tutorials.

This also means that I will use the tidyverse coding style. With this coding style, we name things consistently (e.g., with lowercase letters) and use blank spaces where needed. In short, it makes our code more readable.

If you’re interested in this, see a nice summary by IfKW scholar Lara Kobilke here.
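As a small illustration of what such conventions look like in practice, compare the two lines below. The object names MyWord and my_word are made up for this example, and the sketch only covers two conventions (lowercase names with underscores, blank spaces around <- and after commas):

```r
# Harder to read: mixed capitalisation, no blank spaces
MyWord<-paste0("hello"," and good morning")

# Tidyverse style: lowercase name with underscore,
# blank spaces around <- and after commas
my_word <- paste0("hello", " and good morning")

# Both lines produce the same value; only the readability differs
identical(MyWord, my_word)  # TRUE
```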

1.5 Why should I use R?

There are several reasons why I’m an advocate of R (or similar programming languages such as Python) over programs such as SPSS.

  1. R is free. Unlike most other (statistical) programs, you do not need to buy it (or rely on a university license that is likely to run out once you leave your department).

  2. R is an open source program. Unlike most other programs, its source code - i.e., the basis of the program - is freely available. So are the thousands of packages on CRAN that you can use to extend R’s base functions (we’ll get to those later - these are basically additional functions you may need for more specific analyses).

  3. R offers you flexibility. You can work with almost any type of data and rely on a large (!) set of functions to import, edit, or analyze such data. And if the function you need does not exist yet, you can write it yourself!

  4. Learning R increases your chances on the job market. For many jobs (in academia, market research, data science, or data journalism), applicants are expected to know at least one programming language.

💡 Take-Aways

R is great! 😁