## E.10 Answers to Lecture 10 tutorial

### E.10.1 Answers to Sect. 10.1

1. The correlations:

Plot 1: $$0.94$$ (correlation A);
Plot 2: $$-0.95$$ (correlation D);
Plot 3: $$0.12$$ (correlation B);
Plot 6: $$0.75$$ (correlation C).

Correlation is inappropriate for Plot 4 and Plot 5.

2. Examples of the direction in Plot 1: Any two variables moderately positively correlated, such as height and weight, distance lived from university and travel time, etc.

3. Examples of the direction in Plot 2: Any two variables moderately negatively correlated, such as hours of weekly exercise and body weight, number of SCI110 tutorials missed and final mark, etc.

4. The values of $$R^2$$ are:

Plot 1: $$88.4$$%;
Plot 2: $$90.3$$%;
Plot 3: $$1.4$$%;
Plot 6: $$56.3$$%.

5. No answer. A line is reasonable for Plot 1, Plot 2, Plot 3 (though the relationship is very weak!) and Plot 6, but not for Plot 4 and Plot 5.
6. My very rough slope estimates are:
Plot 1: $$(50-10)/10 \approx 4$$;
Plot 2: $$(20 - 50)/15 \approx -2$$;
Plot 3: $$0$$;
Plot 6: $$(55 - 35)/5 = 4$$.
7. My very rough intercept estimates are:
Plot 1: $$8$$;
Plot 2: $$40$$;
Plot 3: $$32$$;
Plot 6: $$10$$.
8. My very rough estimates are:
Plot 1: $$\hat{y} = 8 + 4x$$;
Plot 2: $$\hat{y} = 40 -2x$$;
Plot 3: $$\hat{y} = 32$$;
Plot 6: $$\hat{y} = 10 + 4x$$.
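
The percentages listed above for Plots 1, 2, 3 and 6 appear to be $$100 \times r^2$$ for the corresponding correlations. A minimal Python sketch of that arithmetic follows. The function name and the use of half-up rounding to one decimal place are assumptions made for illustration; the source does not say how the values were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Correlations quoted in answer 1. Plots 4 and 5 are left out because
# correlation is inappropriate for them.
correlations = {
    "Plot 1": Decimal("0.94"),
    "Plot 2": Decimal("-0.95"),
    "Plot 3": Decimal("0.12"),
    "Plot 6": Decimal("0.75"),
}


def r_squared_percent(r):
    """Return 100 * r^2, rounded half-up to one decimal place (assumed rule)."""
    return (r ** 2 * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


r_squared = {plot: r_squared_percent(r) for plot, r in correlations.items()}

for plot, pct in r_squared.items():
    print(f"{plot}: R^2 = {pct}%")
```

`Decimal` is used rather than floats so that values such as 56.25 round up, as they do in the answers above.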

### E.10.2 Answers to Sect. 10.2

If you count the dots on the scatterplot, you won’t find $$n=38$$ dots, because of overplotting. For example, there are two Corollas from 2006 selling for $9500. If you have time, you can ask students about this, and even ask for suggestions to improve the plot (such as jittering).

1. Approximately linear, negative, reasonably strong.

2. Condition, extras (air con, etc.), sedan/hatch, colour, when rego due, new/old tyres, location, etc.

3. No answer.

4. Looks to be expensive, as $15,000 would be above the line (at least for the line I’d draw).

5. Probably $3900.

6. My guess is about $$b_0\approx 17$$, or $17,000. This would mean the average price of a 2014 second-hand Corolla can be expected to be about $17,000. Note: Since all the cars in the sample are second-hand, technically the results only generalise to second-hand cars, so that $$b_0$$ really is the estimated price of a second-hand 2014 Corolla.

7. $$b_1 = (1 - 17)/(16-0)\approx -1$$. That is, the price reduces by about $1000 for each year older the Corolla gets.

8. Using the above, we have $$\hat{y} = 17 - x$$ approximately. Guessing the regression line won’t, of course, produce this level of precision, so anything close-ish to this is fine.

9. $$\hat{y} = 16.54 - 0.96x$$ (jamovi) or $$\hat{y} = 16.536 - 0.963x$$ (SPSS), where $$y$$ is the price in thousands of dollars, and $$x$$ is the age in years.

10. $$r = -0.929$$, and so $$R^2 = (-0.929)^2 = 0.863$$, or about 86%, so about 86% of the variation in prices can be explained by age alone. The rest can be explained by the car’s condition, odometer reading, accessories, service history, etc.

11. jamovi: $$\hat{y} = 16.54 - (0.96\times 20) = -2.66$$, or -$2660. SPSS: $$\hat{y} = 16.536 - (0.963\times 20) = -2.72$$, or -$2720.
This is clearly silly, as we are extrapolating.

12. jamovi: $$\hat{y} = 16.54 - (0.96\times 6) = 10.78$$, or $10,780. SPSS: $$\hat{y} = 16.536 - (0.963\times 6) = 10.76$$, or $10,760.
So the price seems a bit steep, unless it is highly specified and in great condition.

13. Hardly need a test… $$H_0$$: $$\beta_1=0$$ vs $$H_1$$: $$\beta_1<0$$. From output, $$t=-15.059$$, and $$P=0.000/2=0.000$$: very strong evidence that older cars fetch lower second-hand prices, on average.

14. $$-0.963\pm(2\times 0.064)$$, or $$-0.963\pm 0.128$$, or from $$-1.091$$ to $$-0.835$$.

15. Looks the same really, just reflected left-to-right.

• The size of $$r$$ won’t change, but its sign will change from negative to positive; i.e. $$r=0.93$$.
• Value of $$R^2$$ will be the same.
• Slope will be the same except sign will change (in both cases, the values on the horizontal axis are one year apart).
• Intercept will change a lot… it is the predicted value of the price if the line is extended to year 0 (which is, of course, meaningless).
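
The arithmetic in answers 10 to 15 can be checked with a short Python sketch. It uses only the SPSS figures quoted above, and all the names are illustrative. Two points are assumptions read off the answers, not stated outright: that $$0.064$$ is the standard error of the slope, and that a car of age zero corresponds to the year 2014.

```python
# Values quoted in the answers above (SPSS output); names are illustrative.
B0 = 16.536       # intercept, in thousands of dollars
B1 = -0.963       # slope, in thousands of dollars per year of age
SE_B1 = 0.064     # assumed to be the standard error of the slope (answer 14)
R = -0.929        # correlation coefficient (answer 10)
BASE_YEAR = 2014  # assumed year in which a car has age 0 (answer 6)


def predict_price(age_years):
    """Predicted price, in thousands of dollars, for a car of the given age."""
    return B0 + B1 * age_years


r_squared = R ** 2                # proportion of variation explained by age
price_age_20 = predict_price(20)  # extrapolation: gives a negative price
price_age_6 = predict_price(6)

# Rough interval for the slope: estimate plus or minus two standard errors.
ci_low = B1 - 2 * SE_B1
ci_high = B1 + 2 * SE_B1

# Answer 15: if x is the year of manufacture, then age = BASE_YEAR - year, so
# y-hat = B0 + B1 * (BASE_YEAR - year) = (B0 + B1 * BASE_YEAR) - B1 * year.
slope_by_year = -B1
intercept_by_year = B0 + B1 * BASE_YEAR

print(f"R^2 = {r_squared:.3f}")
print(f"Predicted price at age 20: {price_age_20:.2f} thousand dollars")
print(f"Predicted price at age 6: {price_age_6:.2f} thousand dollars")
print(f"Slope interval: {ci_low:.3f} to {ci_high:.3f}")
print(f"By year: slope {slope_by_year:.3f}, intercept {intercept_by_year:.3f}")
```

Under these assumptions the slope keeps its size but changes sign, while the intercept becomes a large negative number with no practical meaning, in line with the bullet points above.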

### E.10.3 Answers to Sect. 10.3

The point is that the same regression line can be associated with a high correlation, or a poor correlation.
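
A small Python sketch can make this concrete. The data below are hypothetical, constructed only for illustration and not taken from the tutorial: both data sets are built from the line $$\hat{y} = 2 + 3x$$ plus a scatter pattern that sums to zero and is uncorrelated with $$x$$, so least squares recovers the same line in each case while the correlation differs.

```python
from math import sqrt


def fit_line(x, y):
    """Return the least-squares intercept, slope and correlation r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
pattern = [1, -2, 1, 1, -2, 1]  # sums to zero and is uncorrelated with x

tight = [2 + 3 * xi + 0.5 * e for xi, e in zip(x, pattern)]  # little scatter
loose = [2 + 3 * xi + 10 * e for xi, e in zip(x, pattern)]   # lots of scatter

b0_tight, b1_tight, r_tight = fit_line(x, tight)
b0_loose, b1_loose, r_loose = fit_line(x, loose)

print(f"Tight: y-hat = {b0_tight:.2f} + {b1_tight:.2f}x, r = {r_tight:.2f}")
print(f"Loose: y-hat = {b0_loose:.2f} + {b1_loose:.2f}x, r = {r_loose:.2f}")
```

The fitted intercept and slope agree for the two data sets; only the amount of scatter about the line, and hence $$r$$, changes.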