Section 2 Tips for effective R programming

Before getting into actual R code, we’ll start with a few notes about how to use it most effectively. Bad coding habits can make your R code difficult to read and understand, so hopefully these tips will ensure you have good habits right from the start.

2.1 Writing readable code

There are two very good reasons to try to write your code in a clear, understandable way:

  • Other people might need to use your code.
  • You might need to use your code, a few weeks/months/years after you’ve written it.

It’s possible to write R code that “works” perfectly and produces all the results and output you want, but proves very difficult to change when you have to come back to it (because a reviewer asked for one additional analysis, etc.).

2.1.1 Basic formatting tips

You can improve the readability of your code a lot by following a few simple rules:

  • Put spaces between and around variable names and operators (=+-*/)
  • Break up long lines of code
  • Use meaningful variable names composed of 2 or 3 words (avoid abbreviations unless they’re very common and you use them very consistently)

These rules can mean the difference between this:

lm1=lm(y~grp+grpTime,mydf,subset=sext1=="m")

and this:

male_difference = lm(DepressionScore ~ Group + GroupTimeInteraction,
                     data = interview_data,
                     subset = BaselineSex == "Male")

R will treat both pieces of code exactly the same, but for any humans reading, the nicer layout and meaningful names make it much easier to understand what’s happening, and spot any errors in syntax or intent.
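To check that layout really makes no difference to R, here is a minimal sketch that fits the same model twice, once in each style. It uses R's built-in mtcars data as a stand-in, since the interview data above isn't available here; the object names m1 and manual_cars_model are made up for illustration.

```r
# Compact style: valid R, but hard to read
m1=lm(mpg~wt+hp,mtcars,subset=am==1)

# Readable style: same model, with spacing, line breaks and a meaningful name
manual_cars_model = lm(mpg ~ wt + hp,
                       data = mtcars,
                       subset = am == 1)

# Returns TRUE when the two sets of coefficients match
all.equal(coef(m1), coef(manual_cars_model))
```

The only thing that changes between the two versions is how easy they are for a person to read.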

2.1.2 Keeping a consistent style

Try to follow a consistent style for naming things, e.g. using snake_case for all your variable names in your R code, and TitleCase for the columns in your data. Either style is probably better than lowercase with no spacing allmashedtogether.
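As a small sketch of what that convention could look like in practice, the example below uses snake_case for objects created in code and TitleCase for data columns. The data frame and its values are invented for illustration.

```r
# Made-up example data: TitleCase for the column names
interview_data = data.frame(
  ParticipantId = 1:4,
  DepressionScore = c(12, 18, 9, 15)
)

# snake_case for objects created in code
mean_depression_score = mean(interview_data$DepressionScore)
mean_depression_score
```

With a convention like this, you can tell at a glance whether a name refers to a column in the data or to something created in your code.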

2.1.3 Writing comments

One of the best things you can do to make R code readable and understandable is to write comments. R ignores everything from a # to the end of the line, so you can write whatever you want there and it won’t affect the way your code runs.

Comments that explain why something was done are great:

# Need to reverse code the score for question 3
data$DepressionQ3 = 4 - data$DepressionQ3
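To illustrate the difference between a comment that explains why and one that only repeats the code, here is a rough sketch. The survey_data data frame, the 0 to 4 response scale and the reason given for reverse coding are all assumptions made up for this example.

```r
# Hypothetical responses to question 3, assumed to be scored from 0 to 4
survey_data = data.frame(DepressionQ3 = c(0, 1, 4, 2))

# Less helpful: only repeats what the code already says
# Subtract DepressionQ3 from 4

# More helpful: explains why the step is needed
# Question 3 is worded in the opposite direction to the other items,
# so reverse code it to make high scores mean the same thing throughout
survey_data$DepressionQ3Reversed = 4 - survey_data$DepressionQ3

survey_data$DepressionQ3Reversed  # a comment can also follow code on the same line
```

Storing the reversed score in a new column is a design choice for this sketch: it means the line can be re-run without accidentally reversing the score a second time.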

2.2 Don’t panic: dealing with SPSS withdrawal

2.2.1 R can read SPSS files (and CSVs, and almost every kind of file!)

The haven package can read (and write) SPSS data files, so you can read in existing data. Download the SPSS file from SharePoint (right-click the link and choose “Open link in new tab”) and place it in your working directory. Then run:

data = haven::read_spss("Personality.sav")
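A slightly more defensive version of the same step might check that the file is where R expects it before reading. This is only a sketch: it assumes the haven package is installed, and the object name personality_data is made up for illustration. The CSV part uses base R and a temporary file with invented values, so it can run on its own.

```r
# Part 1: SPSS file (only runs if the file and the haven package are available)
spss_file = "Personality.sav"

if (file.exists(spss_file) && requireNamespace("haven", quietly = TRUE)) {
  personality_data = haven::read_spss(spss_file)
  str(personality_data)  # quick overview of the columns and their types
} else {
  message("Skipping the SPSS example: file or haven package not found. ",
          "Working directory: ", getwd())
}

# Part 2: CSV file, using base R and made-up example values
csv_file = tempfile(fileext = ".csv")
write.csv(data.frame(ParticipantId = 1:3, Score = c(10, 12, 9)),
          csv_file, row.names = FALSE)

csv_data = read.csv(csv_file)
csv_data
```

If R reports that it can't find the file, comparing the output of getwd() with the folder you saved the file in is usually the quickest check.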

2.2.2 RStudio has a data viewer

You can also use RStudio’s built-in data viewer to get a more familiar, spreadsheet-style view of your data. In the Environment pane in the top-right, you can click on the name of any data you have loaded to bring up a spreadsheet view:

Data viewer example

This also supports basic sorting and filtering so you can explore the data more easily (you’ll still need to write code using functions like arrange() or filter() if you want to actually make changes to the data though).

For example, if we click on the data we read in from Personality.sav in the Environment pane in the top-right, we can see it in the data viewer.
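To make the difference between viewing data and changing it concrete, here is a minimal sketch. It assumes the dplyr package is installed (arrange() and filter() come from dplyr), and the small score_data data frame is invented for illustration. View() opens the data viewer from code, so it is only called in an interactive session.

```r
# Made-up example data
score_data = data.frame(
  ParticipantId = 1:4,
  Score = c(15, 8, 12, 20)
)

# Opens the same spreadsheet view as clicking the name in the Environment pane
if (interactive()) {
  View(score_data)
}

# Sorting or filtering in the viewer leaves score_data untouched.
# To keep a sorted or filtered version, save the result of the code:
sorted_scores = dplyr::arrange(score_data, Score)
high_scores = dplyr::filter(score_data, Score > 10)

sorted_scores$Score
high_scores$ParticipantId
```

Because the results are saved under new names, the original score_data stays available if you need to go back to it.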