Section 2 Tips for effective R programming
Before getting into actual R code, we’ll start with a few notes about how to use it most effectively. Bad coding habits can make your R code difficult to read and understand, so hopefully these tips will ensure you have good habits right from the start.
2.1 Writing readable code
There are two very good reasons to try to write your code in a clear, understandable way:
- Other people might need to use your code.
- You might need to use your code, a few weeks/months/years after you’ve written it.
It’s possible to write R code that “works” perfectly and produces all the results and output you want, but proves very difficult to change when you have to come back to it (for example, because a reviewer asked for one additional analysis).
2.1.1 Basic formatting tips
You can improve the readability of your code a lot by following a few simple rules:
- Put spaces between and around variable names and operators (= + - * /)
- Break up long lines of code
- Use meaningful variable names composed of 2 or 3 words (avoid abbreviations unless they’re very common and you use them very consistently)
These rules can mean the difference between this:
lm1=lm(y~grp+grpTime,mydf,subset=sext1=="m")
and this:
male_difference = lm(DepressionScore ~ Group + GroupTimeInteraction,
                     data = interview_data,
                     subset = BaselineSex == "Male")
R will treat both pieces of code exactly the same, but for any humans reading, the nicer layout and meaningful names make it much easier to understand what’s happening, and spot any errors in syntax or intent.
2.1.2 Keeping a consistent style
Try to follow a consistent style for naming things, e.g. using snake_case for all your variable names in your R code, and TitleCase for the columns in your data. Either style is probably better than lowercase with no spacing (allmashedtogether).
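As a small sketch of the convention described above, the example below uses TitleCase for the columns and snake_case for the variables. The data frame and its values are made up purely for illustration.

```r
# Hypothetical example data: column names use TitleCase
interview_data = data.frame(
  ParticipantId = 1:4,
  DepressionScore = c(12, 7, 15, 9)
)

# Variables created in your R code use snake_case
mean_depression = mean(interview_data$DepressionScore)
print(mean_depression)
```

With this split, you can tell at a glance whether a name refers to a column in your data or to something you created in your script.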
2.1.3 Writing comments
One of the best things you can do to make R code readable and understandable is write comments - R ignores lines that start with #, so you can write whatever you want and it won’t affect the way your code runs.
Comments that explain why something was done are great:
# Need to reverse code the score for question 3
data$DepressionQ3 = 4 - data$DepressionQ3
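To see the reverse coding in action, here is a minimal runnable sketch. It assumes the question is scored from 0 to 4 and that it is worded in the opposite direction to the other items; both are assumptions for illustration, as is the made-up survey_data data frame.

```r
# Hypothetical responses to question 3, assumed to be scored 0 to 4
survey_data = data.frame(DepressionQ3 = c(0, 1, 4, 2))

# Question 3 is assumed to be worded in the opposite direction to the
# other items, so reverse code it: 0 becomes 4, 1 becomes 3, and so on
survey_data$DepressionQ3 = 4 - survey_data$DepressionQ3

print(survey_data$DepressionQ3)
```

A comment like "subtract the score from 4" would only repeat what the code already says. The comment above is more useful because it records the reason for the step.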
2.2 Don’t panic: dealing with SPSS withdrawal
2.2.1 R can read SPSS files (and CSVs, and almost every kind of file!)
The haven package can read (and write) SPSS data files, so you can read in existing data.
Download the SPSS file from SharePoint (right-click the link and choose "open link in new tab") and place it in your working directory. Then run:
data = haven::read_spss("Personality.sav")
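If you don't have the SPSS file to hand, the sketch below shows the same idea with a small made-up data frame: it is written to a temporary file and read back in. The CSV part uses only base R. The SPSS part runs only if the haven package is installed. The names example_data, csv_path and sav_path are invented for this example.

```r
# A small made-up data frame to save and reload
example_data = data.frame(Id = 1:3, Score = c(2.5, 3.0, 4.5))

# CSV round trip using base R
csv_path = tempfile(fileext = ".csv")
write.csv(example_data, csv_path, row.names = FALSE)
csv_data = read.csv(csv_path)

# SPSS round trip, only attempted if haven is installed
if (requireNamespace("haven", quietly = TRUE)) {
  sav_path = tempfile(fileext = ".sav")
  haven::write_sav(example_data, sav_path)
  sav_data = haven::read_spss(sav_path)
}

print(csv_data)
```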
2.2.2 RStudio has a data viewer
You can also use RStudio’s built-in data viewer to get a more familiar, spreadsheet-style view of your data. In the Environment pane in the top-right, you can click on the name of any data you have loaded to bring up a spreadsheet view.
This also supports basic sorting and filtering so you can explore the data more easily (you’ll still need to write code using functions like arrange() or filter() if you want to actually make changes to the data though).
If we click on this data in our environment in the top right pane, we can see it in the data viewer.
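The difference between exploring data in the viewer and changing it in code can be sketched as follows. This assumes the arrange() and filter() functions mentioned above are the ones from the dplyr package, and it uses a made-up data frame. View() is the function that opens the data viewer, so it is only called in an interactive session.

```r
# A small made-up data frame to explore
example_data = data.frame(
  Name = c("A", "B", "C", "D"),
  Score = c(10, 25, 5, 18)
)

# Open the spreadsheet-style viewer (only makes sense interactively)
if (interactive()) View(example_data)

# Sorting and filtering in code, assuming dplyr is installed
if (requireNamespace("dplyr", quietly = TRUE)) {
  sorted_data = dplyr::arrange(example_data, Score)     # lowest score first
  high_scores = dplyr::filter(example_data, Score > 15) # keep scores above 15
  print(sorted_data)
  print(high_scores)
}
```

Sorting or filtering in the viewer only changes what you see, while the code above creates new data frames (sorted_data and high_scores) that you can use in later steps.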