D.32 Answers: Regression

Answers to exercises in Sect. 35.13.

Answer to Exercise 35.1: 1. \(b_0 = 97.499\) (the intercept); \(b_1=0.0764\) (the slope). 2. \(\hat{y} = 97.499 + 0.0764x\), where \(x\) is the inlet temperature (in \(^\circ\)C) and \(y\) is the removal efficiency (in %). 3. When the inlet temperature increases by \(1^\circ\)C, the removal efficiency increases by 0.076 percentage points, on average. 4. \(H_0\): \(\beta=0\); \(H_1\): \(\beta\ne 0\) (the RQ implies a two-tailed test). 5. \(t = 10.742\), which is huge; \(P<0.001\). 6. \(0.076\pm (2\times 0.007)\), or \(0.076\pm 0.014\), or \(0.062\) to \(0.090\).
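The arithmetic in part 6 can be checked with a few lines of Python. This is only a sketch of the approximate 95% interval (estimate plus or minus two standard errors) using the rounded values quoted in the answer; the variable names are illustrative and not taken from any software output.

```python
# Approximate 95% CI for the slope: estimate ± (2 × standard error).
# Values are the rounded figures quoted in the answer above.
b1 = 0.076      # sample slope
se_b1 = 0.007   # standard error of the slope

margin = 2 * se_b1
ci_lower = b1 - margin
ci_upper = b1 + margin

print(f"{b1} ± {margin:.3f}, or {ci_lower:.3f} to {ci_upper:.3f}")
```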
Answer to Exercise 35.2: 1. The intercept is not about 110; that’s where the line ‘stops,’ but the intercept is the predicted value of \(y\) when \(x=0\). We have to extend the line quite a bit. Using rise-over-run, a guess for the slope is \((190-110)/(180-110) =1.14\). 2. \(\hat{y} = -3.69 + 1.04x\), where \(y\) is punting distance (in feet), and \(x\) is right leg strength (in pounds). 3. For each extra pound of leg strength, the punting distance increases, on average, by about 1 foot. 4. \(H_0\): \(\beta=0\); \(H_1\): \(\beta\ne 0\). (You could answer in terms of correlations.) The question is stated as a two-tailed question, but testing if stronger legs increase kicking distance seems sensible. 5. \(t = 6.16\), which is huge; \(P = 0.0001\) (two-tailed). 6. \(1.0427 \pm (2\times 0.1692)\), or \(1.0427 \pm 0.3384\), or 0.70 to 1.38. 7. Very strong evidence exists in the sample (\(t = 6.16\); two-tailed \(P = 0.0001\)) that punting distance is related to leg strength (slope: \(1.0427\); \(n=13\)).
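As a rough check on parts 1, 5 and 6, the sketch below recomputes the rise-over-run guess, the \(t\)-score (slope divided by its standard error) and the approximate 95% interval. The numbers are those quoted in the answer; the variable names are illustrative only.

```python
# Part 1: rise-over-run guess for the slope, from the values read off the plot.
slope_guess = (190 - 110) / (180 - 110)

# Parts 5 and 6: slope and standard error as quoted in the answer.
b1 = 1.0427
se_b1 = 0.1692

t_score = b1 / se_b1    # test statistic for H0: beta = 0
margin = 2 * se_b1      # approximate 95% margin of error
ci_lower = b1 - margin
ci_upper = b1 + margin

print(f"Slope guess: {slope_guess:.2f}")
print(f"t-score: {t_score:.2f}")
print(f"Approx. 95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```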
Answer to Exercise 35.3: 1. Way too many decimal places. \(r\) is not relevant, as the relationship is non-linear. 2. Regression is inappropriate: the relationship is non-linear. 3. \(y\) should be \(\hat{y}\); the slope and intercept have been swapped (from the plot, the intercept for their line is about 0.4, which they give as the slope). 4. The whole thing is as dodgy-as…
Answer to Exercise 35.4: 1. \(\hat{y} = 17.47 - 2.59x\), where \(x\) is the percentage bitumen by weight, and \(y\) is the percentage air voids by volume. 2. Slope: an increase in the bitumen weight by one percentage point decreases the average percentage air voids by volume by 2.59 percentage points. Intercept: dodgy (extrapolation); in principle, when the bitumen content is 0% by weight, the percentage air voids by volume is about 17.47%. 3. \(t=-74.9\): Massive! Extremely strong evidence (\(P<0.001\)) of a relationship. 4. \(\hat{y} = 17.4712 - (2.5937\times 5) = 4.5027\), or about 4.5%. Expect a good prediction, as the relationship is strong. 5. \(\hat{y} = 17.4712 - (2.5937\times 6) = 1.909\), or about 1.9%. Might be a poor prediction, since this is extrapolation.
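The two predictions in parts 4 and 5 come from substituting into the regression equation, which could be sketched in Python as follows. The function name `predict_air_voids` is hypothetical; the coefficients are those quoted in the answer.

```python
def predict_air_voids(bitumen_pct: float) -> float:
    """Predicted percentage air voids by volume, using the
    coefficients quoted in the answer (17.4712 and -2.5937)."""
    return 17.4712 - 2.5937 * bitumen_pct

# Part 4: 5% bitumen by weight.
pred_5 = predict_air_voids(5)
# Part 5: 6% bitumen by weight (described in the answer as extrapolation).
pred_6 = predict_air_voids(6)

print(f"x = 5: {pred_5:.4f}")
print(f"x = 6: {pred_6:.4f}")
```

The equation gives a number for any \(x\), so the code cannot tell whether a prediction is an extrapolation; that judgement comes from comparing \(x\) with the values seen in the data.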
Answer to Exercise 35.5: 1. \(b_0\): When someone spends no time on sunscreen application, an average of 0.27g has been applied; nonsense. \(b_1\): Each extra minute spent on application adds an average of 2.21g of sunscreen; sensible. 2. The value of \(\beta_0\) could be zero… which would make sense. 3. \(\hat{y} = 0.27 + (2.21\times 8) = 17.95\); an average of about 18g. 4. About 64% of the variation in sunscreen amount applied can be explained by the variation in the time spent on application. 5. \(r = \sqrt{0.64} = 0.8\); the positive root is needed, since the slope is positive. A strong and positive correlation between the variables.
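Parts 3 and 5 can be checked with the short sketch below. It assumes the quoted values \(b_0 = 0.27\), \(b_1 = 2.21\) and \(R^2 = 0.64\), and takes the sign of \(r\) from the sign of the slope; the variable names are illustrative only.

```python
import math

b0, b1 = 0.27, 2.21   # intercept and slope quoted in the answer
r_squared = 0.64      # R-squared quoted in the answer (64%)

# Part 3: predicted amount of sunscreen (in g) after 8 minutes.
y_hat = b0 + b1 * 8

# Part 5: r is the square root of R-squared, with the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"Prediction: {y_hat:.2f} g")
print(f"r = {r:.1f}")
```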
Answer to Exercise 35.6: 1. No. 2. Possibly; no idea of accuracy of predictions really. 3. Intercept: Weight of infant with chest circumference zero; silly. Slope: average increase in birth weight (in g) for each increase in chest circumference by one cm. 4. Intercept: grams; slope: grams/cm. 5. \(\hat{y}=2538.7\)g. 6. Too many decimal places! Regression equation implies predicting to 0.0001 of a gram. \(r\) has too many decimal places too.