5 Exercise 3 - The taste of cheese

Data: $(y_i, x_i)$, $i = 1, \dots, 30$

Model: $E(Y_i) = \alpha + \beta x_i$, $\mathrm{Var}(Y_i) = \sigma^2$

Context: This model might be considered for an experiment involving the chemical constituents of cheese and its taste. The dataset contains the concentrations of acetic acid, hydrogen sulphide (H2S) and lactic acid, as well as a subjective taste score. It is of interest to investigate the effects of the different acids on the taste score.

Dataset: cheese.csv

Columns:

          C1: Case - Sample number
          C2: Taste - Taste score
          C3: Acetic.Acid - Acetic acid concentration
          C4: H2S - H2S concentration
          C5: Lactic.Acid - Lactic acid concentration

You can read in the data using:

cheese <- read.csv("cheese.csv")
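
Before plotting, it can be worth checking that the import produced the columns listed above. The following is only a sketch: so that it runs on its own, it reads three made-up rows (not values from cheese.csv), and the names `csv_text`, `toy` and `expected` are hypothetical. With the real data, the same checks would be applied to `cheese`.

```r
# Hypothetical stand-in for cheese.csv: three MADE-UP rows that only
# mimic the documented column layout (these are not real measurements).
csv_text <- "Case,Taste,Acetic.Acid,H2S,Lactic.Acid
1,10.5,90,20,0.9
2,20.1,150,150,1.2
3,35.0,400,3000,1.6"
toy <- read.csv(text = csv_text)

# Structure of the data frame: column names, types and first values
str(toy)

# Check that the column names match those listed in the exercise
expected <- c("Case", "Taste", "Acetic.Acid", "H2S", "Lactic.Acid")
stopifnot(identical(names(toy), expected))

# H2S and Lactic.Acid must be strictly positive for log() to be defined
stopifnot(all(toy$H2S > 0), all(toy$Lactic.Acid > 0))
```

For the real dataset, `str(cheese)` and `summary(cheese)` give the same kind of overview.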

5.1 Exploratory analysis

  (a) Produce scatterplots of Taste (y) against Lactic.Acid (x), and of Taste (y) against H2S (x).
# taste vs lactic acid
plot(Taste ~ Lactic.Acid, data = cheese, xlab = "Lactic acid concentration", ylab = "Taste score")
# taste vs H2S
plot(Taste ~ H2S, data = cheese, xlab = "H2S concentration", ylab = "Taste score")
  (b) Now plot Taste against log(H2S), and against log(Lactic.Acid). In R, the log function computes the natural logarithm, for example log(H2S).
# taste vs log lactic acid
plot(Taste ~ log(Lactic.Acid), data = cheese, xlab = "Log lactic acid concentration", ylab = "Taste score")
# taste vs log H2S
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
  (c) Which of the four variables (H2S, log(H2S), Lactic.Acid, log(Lactic.Acid)) seems best for describing a linear relationship with Taste?

Hint: in which of the four plots do the points lie closest to a straight line, with roughly even scatter about it along the whole range of x?
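
As a numerical complement to judging the plots by eye, one option is to compare the sample correlation of Taste with each candidate variable. This is a sketch under stated assumptions, not part of the original exercise: the data frame `toy` below is simulated (it is NOT the cheese data, and Taste is generated to be linear in log(H2S) purely for illustration), and the names `candidates` and `cors` are hypothetical.

```r
# SIMULATED illustrative data, not the cheese dataset.
# Taste is constructed to be roughly linear in log(H2S).
set.seed(1)
toy <- data.frame(H2S = exp(runif(30, 3, 9)),
                  Lactic.Acid = runif(30, 0.8, 2))
toy$Taste <- 5 * log(toy$H2S) - 10 + rnorm(30, sd = 3)

# The four candidate explanatory variables from part (c)
candidates <- with(toy, data.frame(H2S,
                                   log.H2S = log(H2S),
                                   Lactic.Acid,
                                   log.Lactic.Acid = log(Lactic.Acid)))

# Pearson correlation of Taste with each candidate
cors <- sapply(candidates, function(x) cor(toy$Taste, x))
round(cors, 2)
```

With the real data, `toy` would be replaced by `cheese`. A correlation close to 1 or -1 suggests a strong linear relationship, but it is only a summary: the scatterplots remain the main evidence, since a single outlier or a curved pattern can distort the correlation.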

5.2 Fitting a model

  (d) Fit a linear regression (using the lm command) with Taste as the response variable and the variable you selected in part (c) as the explanatory variable. Make a note of the fitted model.
# here log(H2S) is taken as the variable chosen in part (c)
model <- lm(Taste ~ log(H2S), data = cheese)
  (e) Plot the data again and add the fitted line from part (d) using the abline command.
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
abline(model, col = "red", lwd = 1.5)
  (f) How well do you think the model and the data agree?
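
The fitted model in part (d) and the question of agreement in part (f) can be explored a little further in R. The sketch below is one possible approach rather than the intended solution: it uses simulated data (NOT the cheese dataset) so that it runs on its own, and the names `toy`, `toy_model`, `est` and `r2` are hypothetical. With the real data you would use `model` from part (d) instead of `toy_model`.

```r
# SIMULATED illustrative data, not the cheese dataset.
set.seed(1)
toy <- data.frame(H2S = exp(runif(30, 3, 9)))
toy$Taste <- 5 * log(toy$H2S) - 10 + rnorm(30, sd = 3)

toy_model <- lm(Taste ~ log(H2S), data = toy)

# Estimated intercept and slope, i.e. the fitted line to note down
est <- coef(toy_model)
est

# Proportion of the variation in Taste explained by the fitted line
r2 <- summary(toy_model)$r.squared
r2

# Residuals against fitted values: look for curvature or changing spread
plot(fitted(toy_model), resid(toy_model),
     xlab = "Fitted taste score", ylab = "Residual")
abline(h = 0, lty = 2)
```

If the model and the data agree well, the residuals should scatter around zero with no clear pattern. Curvature suggests the relationship is not linear on the chosen scale, and a fan shape suggests the constant variance assumption may not hold.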