# Chapter 3 Exercises

## 3.1 Exercise 1 - Cars

**Context**: A sample of 86 cars had their city fuel economy measured in miles per gallon and for each car
the size of the engine, horsepower (an indication of how powerful the engine is) and the length
of the wheelbase (distance between the centres of the front and rear wheels) were also recorded.
It is of interest to model city fuel economy using the engine size, horsepower and wheelbase as
predictors.

**Data: cars.csv**

Read in the data using:

`cars <- read.csv("cars.csv")`

- Use an appropriate exploratory analysis to explore the relationships between cmpg, engine size, horsepower and wheelbase. Is there anything that might concern you here?

The `pairs()` function may help in performing an exploratory analysis.

The relationships appear to be fairly ____, with the exception of the relationships between ____ and the other variables. This may potentially be fixed by performing a transformation.
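As one possible way to carry out the exploratory step, the sketch below draws a scatterplot matrix and prints a correlation matrix. Because `cars.csv` is not reproduced here, it uses R's built-in `mtcars` data as a stand-in (an assumption: `mpg`, `disp`, `hp` and `wt` play the roles of fuel economy and the three predictors). With the exercise data you would pass the relevant columns of `cars` instead.

```r
# Stand-in data (assumption): built-in mtcars, not the cars.csv from the exercise
demo <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Scatterplot matrix of the response and the three predictors
pairs(demo)

# Pairwise correlations as a numerical companion to the plots
cor_mat <- cor(demo)
print(round(cor_mat, 2))

# Same plot with a log-transformed response, to see whether any curvature is reduced
demo_log <- transform(demo, log_mpg = log(mpg))
pairs(demo_log[, c("log_mpg", "disp", "hp", "wt")])
```

Comparing the two scatterplot matrices side by side is a quick, informal way to judge whether a transformation of the response is worth trying.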

- Fit a multiple linear regression model to the data in order to predict cmpg from engine size, horsepower and wheelbase.

Fit a multiple linear regression model and then test to see if removing any of the variables would improve the model.
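The hint above can be sketched as follows. This is a minimal illustration, again using the built-in `mtcars` data as a stand-in for `cars.csv` (an assumption), and the choice of `disp` as the term to drop is purely for demonstration. `drop1()` and `anova()` are standard R functions for single-term deletion tests and for comparing nested models.

```r
# Stand-in data (assumption): built-in mtcars, not the cars.csv from the exercise
demo <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Full model with all three predictors
full <- lm(mpg ~ disp + hp + wt, data = demo)
print(summary(full))

# F test for removing each predictor in turn, keeping the others
print(drop1(full, test = "F"))

# Refit without one predictor (disp chosen only for illustration)
reduced <- update(full, . ~ . - disp)

# Compare the nested models: a large p-value suggests the dropped term adds little
print(anova(reduced, full))
```

For a single dropped term, the F test from `anova()` matches the corresponding row of `drop1()`, so either can be used to justify a simpler model.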

- Examine the assumptions of the selected model and comment on the model fit.

You can use `plot()` on the model you created to get a series of diagnostic plots that help with examining model assumptions.

The residuals ____ and they ____.

The residuals in the normal Q-Q plot ____ to the line.

The histogram of the model residuals ____ bell-shaped.
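The diagnostic checks described above might be produced along these lines. This is a sketch under the same assumption as before: the built-in `mtcars` data stands in for `cars.csv`, and the log-response model is fitted to it only for illustration.

```r
# Stand-in data (assumption): built-in mtcars, not the cars.csv from the exercise
demo <- mtcars[, c("mpg", "disp", "hp", "wt")]
fit <- lm(log(mpg) ~ disp + hp + wt, data = demo)

# The four standard lm diagnostic plots in a 2 x 2 grid:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)  # restore the previous plotting layout

# Histogram of the residuals as an extra check on normality
res <- resid(fit)
hist(res, main = "Model residuals", xlab = "Residual")
```

The residuals-versus-fitted plot is the one to read for constant variance and zero mean, while the Q-Q plot and histogram address normality.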

- Using an appropriate regression model, find and interpret a 95% **confidence interval** for the logarithm of city fuel economy of a future car with: Engine Size = 2, Horsepower = 65, Wheelbase = 95

The function `predict()` in `R` may come in useful.

The lower end of the interval is ____ *(to 4 decimal places)*.

The upper end of the interval is ____ *(to 4 decimal places)*.

```
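# Fit the log-response model using every other column in cars as a predictor
model.cars <- lm(log(cmpg) ~ ., data = cars)
# 95% confidence interval for the mean of log(cmpg) at the given predictor values
predict(model.cars, newdata = data.frame(EngSize=2, HorsePow=65, Wheelbase=95), interval = "confidence")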
```

```
## fit lwr upr
## 1 3.284275 3.229901 3.338648
```
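To report the endpoints to 4 decimal places, the values printed in the output above can be rounded directly. The short sketch below also exponentiates them to move back to the miles-per-gallon scale. Reading the back-transformed interval as an interval for the typical (median) city fuel economy is an assumption that relies on the log-scale model being adequate.

```r
# Interval endpoints on the log scale, copied from the printed output
ci_log <- c(lwr = 3.229901, upr = 3.338648)

# Round to 4 decimal places for reporting
ci_4dp <- round(ci_log, 4)
print(ci_4dp)

# Back-transform to miles per gallon (exp undoes the log of the response)
ci_mpg <- exp(ci_log)
print(ci_mpg)
```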

- Using the same regression model, find and interpret a 95% **prediction interval** for the logarithm of city fuel economy of a future car with: Engine Size = 2, Horsepower = 65, Wheelbase = 95

The function `predict()` in `R` with the argument `interval = "predict"` may come in useful.

The lower end of the interval is ____ *(to 4 decimal places)*.

The upper end of the interval is ____ *(to 4 decimal places)*.

```
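# Fit the log-response model using every other column in cars as a predictor
model.cars <- lm(log(cmpg) ~ ., data = cars)
# 95% prediction interval for log(cmpg) of a single future car ("predict" partially matches "prediction")
predict(model.cars, newdata = data.frame(EngSize=2, HorsePow=65, Wheelbase=95), interval = "predict")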
```

```
## fit lwr upr
## 1 3.284275 3.083691 3.484858
```
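The prediction interval is wider than the confidence interval because it accounts for the residual variation of an individual car as well as the uncertainty in the estimated mean. The sketch below reconstructs both intervals by hand from the standard error of the fit and the residual standard error, then checks them against `predict()`. It uses the built-in `mtcars` data as a stand-in for `cars.csv` (an assumption), and the values in `new_car` are hypothetical.

```r
# Stand-in data (assumption): built-in mtcars, not the cars.csv from the exercise
demo <- mtcars[, c("mpg", "disp", "hp", "wt")]
fit <- lm(log(mpg) ~ ., data = demo)

# Hypothetical new observation, chosen only for illustration
new_car <- data.frame(disp = 160, hp = 110, wt = 2.6)

# Point prediction with its standard error, residual df and residual scale
p <- predict(fit, newdata = new_car, se.fit = TRUE)
t_crit <- qt(0.975, df = p$df)

# Confidence interval: uncertainty in the estimated mean only
conf_int <- p$fit + c(-1, 1) * t_crit * p$se.fit

# Prediction interval: adds the residual variance of a single new observation
pred_int <- p$fit + c(-1, 1) * t_crit * sqrt(p$se.fit^2 + p$residual.scale^2)

# The same intervals as returned by predict()
conf_r <- predict(fit, newdata = new_car, interval = "confidence")
pred_r <- predict(fit, newdata = new_car, interval = "prediction")

print(rbind(by_hand = conf_int, predict = conf_r[1, c("lwr", "upr")]))
print(rbind(by_hand = pred_int, predict = pred_r[1, c("lwr", "upr")]))
```

Both intervals share the same centre (the fitted value); only the standard error used to build them differs.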