5 Exercise 3 - The taste of cheese
Data: \((y_i, x_i)\), \(i = 1, \dots, 30\)
Model: \(\mathbb{E}(Y_i) = \alpha + \beta x_i\), \(Var(Y_i) = \sigma^2\)
Context: This model might be considered for an experiment involving the chemical constituents of cheese and its taste. The dataset contains the concentrations of acetic acid, hydrogen sulphide (\(H_2S\)) and lactic acid, as well as a subjective taste score. It is of interest to investigate the effects of the different acids on the taste score.
Dataset: cheese.csv
Columns:
- C1: Case - Sample number
- C2: Taste - Taste score
- C3: Acetic.Acid - Acetic acid concentration
- C4: H2S - \(H_2S\) concentration
- C5: Lactic.Acid - Lactic acid concentration
You can read in the data using:
cheese <- read.csv("cheese.csv")
5.1 Exploratory analysis
- Produce scatterplots of Taste (\(y\)) against Lactic.Acid (\(x\)), and Taste (\(y\)) against H2S (\(x\)).
# taste vs lactic acid
plot(Taste ~ Lactic.Acid, data = cheese, xlab = "Lactic acid concentration", ylab = "Taste score")
# taste vs H2S
plot(Taste ~ H2S, data = cheese, xlab = "H2S concentration", ylab = "Taste score")
- Now plot Taste against log(H2S), and against log(Lactic.Acid). The command in R to perform a natural logarithmic transform is, for example, log(H2S).
# taste vs log lactic acid
plot(Taste ~ log(Lactic.Acid), data = cheese, xlab = "Log lactic acid concentration", ylab = "Taste score")
# taste vs log H2S
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
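Since the transformed plots rely on log(), it may help to confirm how it behaves. The short sketch below uses only base R functions and arbitrary illustrative numbers (not values from the cheese data) to show that log() is the natural logarithm by default, and what happens at zero.

```r
# log() uses base e unless a base is supplied
log(exp(1))           # natural log of e
log(100, base = 10)   # base-10 log via the base argument
log10(100)            # shortcut for base 10

# Concentrations must be strictly positive for the transform to be finite:
log(0)                # -Inf
```

If a concentration column contained zeros, the log-transformed plot would therefore drop or misplace those points, so it is worth checking the range of the variable first, for example with summary().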
- Which of the 4 variables (H2S, log(H2S), Lactic.Acid, log(Lactic.Acid)) seems best for describing a linear relationship with Taste?
Hint: which of the four plots shows the points falling closest to a straight line?
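As a rough numerical complement to judging the plots by eye, one could compare the correlation of Taste with each candidate variable. The sketch below is only an illustration: candidate_cors is a hypothetical helper (not part of the exercise), and toy is a simulated stand-in with the same column names so that the code runs without cheese.csv. Its numbers are arbitrary and say nothing about the real data.

```r
# Hypothetical helper: Pearson correlation of Taste with each of the
# four candidate explanatory variables in a data frame d.
candidate_cors <- function(d) {
  c(
    "H2S"              = cor(d$Taste, d$H2S),
    "log(H2S)"         = cor(d$Taste, log(d$H2S)),
    "Lactic.Acid"      = cor(d$Taste, d$Lactic.Acid),
    "log(Lactic.Acid)" = cor(d$Taste, log(d$Lactic.Acid))
  )
}

# Simulated stand-in data (NOT the cheese data; values are made up).
set.seed(1)
toy <- data.frame(
  H2S         = exp(runif(30, min = 3, max = 9)),
  Lactic.Acid = runif(30, min = 0.8, max = 2)
)
toy$Taste <- 5 * log(toy$H2S) + rnorm(30, sd = 5)

cors <- candidate_cors(toy)
round(cors, 2)

# With the real data loaded, the call would be: candidate_cors(cheese)
```

A correlation close to 1 or -1 suggests a strong linear association, but it can hide curvature or outliers, so it should be read alongside the scatterplots rather than instead of them.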
5.2 Fitting a model
- Fit a linear regression (using the lm command) with Taste as the response variable and the explanatory variable you selected from part (c). Make a note of the fitted model.
model <- lm(Taste ~ log(H2S), data = cheese)
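To make a note of the fitted model, the estimates of \(\alpha\) and \(\beta\) can be pulled out with coef(). The sketch below shows one possible way to do this. It fits the same formula to a simulated stand-in data frame (toy, with made-up values, not the cheese data) so that it runs on its own. The names toy, toy_model, alpha_hat and beta_hat are illustrative only.

```r
# Simulated stand-in data (NOT the cheese data; values are made up).
set.seed(1)
toy <- data.frame(H2S = exp(runif(30, min = 3, max = 9)))
toy$Taste <- 5 * log(toy$H2S) + rnorm(30, sd = 5)

toy_model <- lm(Taste ~ log(H2S), data = toy)

# coef() returns a named vector: "(Intercept)" and "log(H2S)"
est       <- coef(toy_model)
alpha_hat <- est[["(Intercept)"]]
beta_hat  <- est[["log(H2S)"]]

# Write the fitted line in the form E(Taste) = alpha + beta * log(H2S)
cat(sprintf("Fitted model: Taste = %.2f + %.2f * log(H2S)\n", alpha_hat, beta_hat))
```

With the real data, coef(model) or summary(model) gives the corresponding estimates for the cheese dataset.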
- Produce a plot with a line from your fitted model in (d) using the abline command.
plot(Taste ~ log(H2S), data = cheese, xlab = "Log H2S concentration", ylab = "Taste score")
abline(model, col = "red", lwd = 1.5)
- How well do you think the model and the data agree?
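Beyond looking at the fitted line, two common checks of agreement are the proportion of variance explained (\(R^2\)) and a plot of residuals against fitted values. The sketch below illustrates both under the assumption that a simple linear model has been fitted. It uses a simulated stand-in (toy, made-up values, not the cheese data) so that it runs without cheese.csv.

```r
# Simulated stand-in data (NOT the cheese data; values are made up).
set.seed(1)
toy <- data.frame(H2S = exp(runif(30, min = 3, max = 9)))
toy$Taste <- 5 * log(toy$H2S) + rnorm(30, sd = 5)
toy_model <- lm(Taste ~ log(H2S), data = toy)

fit_summary <- summary(toy_model)
r2        <- fit_summary$r.squared  # proportion of variance in Taste explained
sigma_hat <- fit_summary$sigma      # residual standard error (estimate of sigma)

# Residuals vs fitted values: look for roughly even scatter around zero,
# with no curve or funnel shape.
plot(fitted(toy_model), resid(toy_model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

For the exercise itself, the same calls would be applied to model. A curved pattern in the residual plot would suggest the straight-line form is inadequate, while a funnel shape would cast doubt on the constant variance assumption \(Var(Y_i) = \sigma^2\).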