Session 11 Programming with style, and further topics.

Here are some examples of code that, arguably, is not well written. Can you say what’s wrong, and improve it?

11.1 Indentation and spacing

  • Let your code breathe.

  • Leave space around operators.

# BAD:
x<-2*3

# GOOD:
x <- 2 * 3
  • Indent loops and if statements.
# opening brace on the same line as the for
# closing brace on its own line
# indent the body by two spaces

n <- 10

for(i in 1:n){
  print(i)
}

print("Done!")
# if / else like this:

if(x == 5){
  print("x is 5")
} else {
  print("x is not 5")
}

# this makes it easy to chain if/else statements
# but don't make the chains too long!

if(x == 5){
  print("x is 5")
} else if(x == 6){
  print("x is 6")
} else {
  print("x is not 5 or 6")
}
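
When a chain grows beyond a few branches that all test the same variable, one alternative is R's switch(). The following is only a minimal sketch: the function name day_type and the day labels are made up for illustration.

```r
# day_type() is a hypothetical helper, used only for illustration
day_type <- function(day){
  switch(day,
         Sat = ,
         Sun = "weekend",
         Mon = ,
         Tue = ,
         Wed = ,
         Thu = ,
         Fri = "weekday",
         "unknown")
}

day_type("Sun")
day_type("Wed")
day_type("Someday")
```

An empty entry such as `Sat = ` falls through to the next value, and the final unnamed argument is the default returned when nothing matches.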

11.2 Naming functions and variables

  • Use a name that concisely describes the function or variable
# BAD:
function1 <- function(a){
  a^2
}

# GOOD:
square <- function(a){
  a^2
}
  • Naming convention: choose either snake_case or camelCase and use it consistently. Avoid separating.with.dots.
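
One reason to avoid dots is that R's S3 system looks up methods by names of the form generic.class, so a dot in a function name can be read as (or can actually behave as) a method. A small sketch, in which the class name "reading" is invented purely for illustration:

```r
# "reading" is a made-up class name, used only for this illustration
temperature <- structure(21.5, class = "reading")

# because of its name, this function becomes the print method for "reading"
print.reading <- function(x, ...){
  cat("A reading of", unclass(x), "\n")
}

# dispatches to print.reading() even though we never call it by name
print(temperature)
```

A name such as print_reading would carry no such hidden meaning.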

11.3 Don’t repeat yourself

  • If you copy/paste some code and ‘tweak’ it, that’s a sign you need to rewrite your code, e.g. using a loop or a function.
# BAD:
first <- df$result[1]^2
second <- df$result[2]^2
third <- df$result[3]^2

# GOOD:
results <- df$result[1:3]^2
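
The same idea applies when the repeated code is more than a single expression: the steps can be written once as a function and then applied wherever they are needed. The sketch below assumes a small made-up data frame; the column names and the function name standardise are hypothetical.

```r
# df is a made-up data frame standing in for real data
df <- data.frame(result = c(2, 3, 4), control = c(1, 5, 6))

# BAD: the same steps copied and tweaked for each column
result_scaled <- (df$result - mean(df$result)) / sd(df$result)
control_scaled <- (df$control - mean(df$control)) / sd(df$control)

# GOOD: write the steps once, as a function (standardise is a hypothetical name)
standardise <- function(v){
  (v - mean(v)) / sd(v)
}

# apply the function to every column; returns a list with one element per column
scaled <- lapply(df, standardise)
```

If the scaling rule ever changes, only standardise() needs to be edited.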

11.4 Avoid writing cryptic and unclear code

  • Prefer to use piping to give a step-by-step explanation of what code is doing.

  • Code will be read many more times than it is written.

# BAD:
TRUE * 1

# GOOD:
as.numeric(TRUE)
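
As an illustration of the piping advice above, here is a minimal sketch using the native pipe |>, which assumes R 4.1 or later (the %>% pipe from the magrittr package is used in a similar way). The data are made up.

```r
scores <- c(4, 9, 16, 25)

# BAD: nested calls have to be read from the inside out
nested <- round(mean(sqrt(scores)), 1)

# GOOD: the pipe reads top to bottom, one step per line
piped <- scores |>
  sqrt() |>
  mean() |>
  round(1)
```

Both versions compute the same value; the piped one lists the steps in the order they happen.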

11.5 Comments should say why, not what

# BAD:

# assign 1 to x
x <- 1


# GOOD:

# this initialises the counter
x <- 1

11.6 Never hard code!

h <- c(1, 2, 3)
# BAD:
# hard-coded values make code hard to reuse
h_mean <- sum(h) / 3

# GOOD:
h_mean <- sum(h) / length(h)
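
Hard-coded values can also hide inside function bodies. One common remedy, sketched here with hypothetical names (is_significant, alpha), is to turn the value into an argument with a sensible default.

```r
# BAD: the threshold 0.05 is buried in the function body
is_significant <- function(p){
  p < 0.05
}

# GOOD: the threshold is an argument with a default, so it can be changed per call
is_significant <- function(p, alpha = 0.05){
  p < alpha
}

is_significant(0.03)
is_significant(0.03, alpha = 0.01)
```

The first call uses the default threshold; the second overrides it without touching the function.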

A discussion around hard coding

11.7 Other assorted advice, and further reading.

  • Functions should have one job. If a function does many jobs, it should be split into many functions.

  • Functions should be short and clear. Rule of thumb: if you can’t see a whole function on the screen, split it into smaller functions.

  • Avoid ‘spaghetti code’. Keep functions separate from analysis code.

  • The Google style guide is a very good one: https://google.github.io/styleguide/Rguide.xml

  • There is also a free eBook for the tidyverse style: https://style.tidyverse.org/

  • This document lists what it suggests to be a “best description of the standards followed by the leading R programmers”: ftp://ftp.math.ethz.ch/sfs/pub/Software/CRAN/web/packages/rockchalk/vignettes/Rstyle.pdf

  • Always remember: someone else is likely to need to read your code in the future. The most likely person is you in six months’ time. Six-months-ago-you doesn’t reply to emails.
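
The first two points in this list (one job per function, and short functions) can be sketched as follows. The function names and the data are made up for illustration.

```r
# BAD: one function that cleans the data and summarises it
clean_and_summarise <- function(v){
  v <- v[!is.na(v)]
  c(mean = mean(v), sd = sd(v))
}

# GOOD: one job per function (both names are hypothetical)
remove_missing <- function(v){
  v[!is.na(v)]
}

summarise_values <- function(v){
  c(mean = mean(v), sd = sd(v))
}

# analysis code, kept separate from the function definitions
readings <- c(1, 2, NA, 3)
clean <- remove_missing(readings)
stats <- summarise_values(clean)
```

Each small function can now be tested and reused on its own, for example remove_missing() before a plot rather than a summary.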

11.8 Further Topics

To keep this course brief, there are a number of topics that we omitted from the course. Some examples of these are below, just to give a flavour of what is possible. Further concepts in R, specific to statistics and machine learning, will be introduced on other modules.