5 Exercise 3 - The taste of cheese

Data: \((y_i, x_i)\), \(i = 1, \dots, 30\)

Model: \(\mathbb{E}(Y_i) = \alpha + \beta x_i\), \(\mathrm{Var}(Y_i) = \sigma^2\)

Context: This model might be considered for an experiment involving the chemical constituents of cheese and its taste. The dataset contains the concentrations of acetic acid, hydrogen sulphide (\(H_2S\)) and lactic acid, as well as a subjective taste score. It is of interest to investigate the effects of the different acids on the taste score.

Dataset: cheese.csv

Columns:

          C1: Case - Number of sample
          C2: Taste - Taste score
          C3: Acetic.Acid - Acetic acid concentration
          C4: H2S - \(H_2S\) concentration
          C5: Lactic.Acid - Lactic acid concentration

You can read in the data using:

cheese <- read.csv("cheese.csv")
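
Before plotting, it can help to confirm that the file was read as expected. The following is a minimal sketch, not part of the original exercise: the helper name `check_cheese_columns` is hypothetical, and the `file.exists` guard is only there so the snippet runs even when cheese.csv is not in the working directory.

```r
# Hypothetical helper: return the expected column names that are missing
# from a data frame (an empty character vector means all are present).
check_cheese_columns <- function(data) {
  expected <- c("Case", "Taste", "Acetic.Acid", "H2S", "Lactic.Acid")
  setdiff(expected, names(data))
}

# Only runs if the dataset is available in the working directory.
if (file.exists("cheese.csv")) {
  cheese <- read.csv("cheese.csv")
  str(cheese)                           # column types and first few values
  print(summary(cheese))                # range of each variable
  print(check_cheese_columns(cheese))   # character(0) if nothing is missing
}
```

If any names are reported as missing, check the header row of the file before continuing.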

5.1 Exploratory analysis

  1. Produce scatterplots of Taste (\(y\)) against Lactic.Acid (\(x\)), and Taste (\(y\)) against H2S (\(x\)).
# taste vs lactic acid
plot(Taste ~ Lactic.Acid, data = cheese, xlab = "Lactic acid concentration", ylab = "Taste score")
# taste vs H2S
plot(Taste ~ H2S, data = cheese, xlab = "H2S concentration", ylab = "Taste score")
  1. Now plot Taste against log(H2S), and against log(Lactic.Acid). In R, log() computes the natural logarithm, for example log(H2S).
# taste vs log(lactic acid)
plot(Taste ~ log(Lactic.Acid), data = cheese, xlab = "Log lactic acid concentration", ylab = "Taste score")
# taste vs log(H2S)
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
  1. Which of the 4 variables (H2S, log(H2S), Lactic.Acid, log(Lactic.Acid)) seems best for describing a linear relationship with Taste?

Hint: which of the four plots shows the points lying closest to a straight line (of any slope and intercept)?
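
The visual comparison can be complemented with a numerical one. The sketch below computes the Pearson correlation between Taste and each of the four candidate variables. The helper name `taste_correlations` is hypothetical, and correlation only measures the strength of linear association, so it should support the plots rather than replace them.

```r
# Hypothetical helper: Pearson correlation of Taste with each candidate
# explanatory variable, on the original and the log scale.
taste_correlations <- function(data) {
  candidates <- list(
    "H2S"              = data$H2S,
    "log(H2S)"         = log(data$H2S),
    "Lactic.Acid"      = data$Lactic.Acid,
    "log(Lactic.Acid)" = log(data$Lactic.Acid)
  )
  sapply(candidates, function(x) cor(data$Taste, x))
}

# Only runs if the dataset is available in the working directory.
if (file.exists("cheese.csv")) {
  cheese <- read.csv("cheese.csv")
  print(round(taste_correlations(cheese), 3))
}
```

A value closer to 1 or -1 suggests a relationship that is closer to linear, but a single outlier can inflate or deflate the figure, so check it against the scatterplots.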

5.2 Fitting a model

  1. Fit a linear regression (using the lm command) with Taste as the response variable and the explanatory variable you selected in Section 5.1. Make a note of the fitted model.
model <- lm(Taste ~ log(H2S), data = cheese)
  1. Produce a plot of the data and add the line from your fitted model using the abline command.
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
abline(model, col = "red", lwd = 1.5)
  1. How well do you think the model and the data agree?
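
To note the fitted model and judge its agreement with the data, the estimates and some simple summaries can be extracted from the lm object. This is a minimal sketch under stated assumptions: the helper name `fit_summary` is hypothetical, and the choice of \(R^2\), the residual standard error, and a residuals-versus-fitted plot is one common set of checks, not the only one.

```r
# Hypothetical helper: collect the estimates and two goodness-of-fit
# summaries from a simple linear regression fitted with lm().
fit_summary <- function(fit) {
  s <- summary(fit)
  est <- coef(fit)
  c(intercept   = est[[1]],      # estimate of alpha
    slope       = est[[2]],      # estimate of beta
    r_squared   = s$r.squared,   # proportion of response variance explained
    residual_sd = s$sigma)       # estimate of sigma
}

# Only runs if the dataset is available in the working directory.
if (file.exists("cheese.csv")) {
  cheese <- read.csv("cheese.csv")
  model <- lm(Taste ~ log(H2S), data = cheese)
  print(round(fit_summary(model), 3))
  # Residuals against fitted values: curvature suggests the straight line is
  # inadequate, and a changing spread suggests the variance is not constant.
  plot(fitted(model), resid(model),
       xlab = "Fitted taste score", ylab = "Residual")
  abline(h = 0, lty = 2)
}
```

The residual plot relates directly to the model assumptions: a patternless band around zero is consistent with \(\mathbb{E}(Y_i) = \alpha + \beta x_i\) and a constant \(\sigma^2\).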