7.5 Further Exercises
We will now repeat the regression with the gestational weeks variable centered at 39 weeks. We could have created a new centered variable and added it to the dataset. Instead, the code below replaces gestwks with gestwks - 39 inside the pipeline only, so the centered values are temporary and the stored dataset is left unchanged.
#--- Fit a regression line with gestwks centered and store the result
mod2 <- bab9 %>% mutate(gestwks = gestwks-39) %>% lm(bweight ~ gestwks, data = .)
#--- Get a summary of the regression and confidence intervals
summary(mod2)
##
## Call:
## lm(formula = bweight ~ gestwks, data = .)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##  -1810   -285     -7    283   1248
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3193.76      17.58   181.7   <2e-16 ***
## gestwks       206.64       7.48    27.6   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 441 on 639 degrees of freedom
## Multiple R-squared: 0.544, Adjusted R-squared: 0.543
## F-statistic: 762 on 1 and 639 DF, p-value: <2e-16
confint(mod2)
##             2.5 % 97.5 %
## (Intercept)  3159   3228
## gestwks       192    221
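The alternative mentioned at the start of this section, storing the centered values as a new variable, can be sketched as follows. This is only an illustration under stated assumptions: the data frame toy is simulated (it is not bab9), and the column name gestwks39 is a hypothetical choice. The sketch also shows a third option, centering inside the model formula with I(), and checks that both routes give the same fitted coefficients.

```r
# Simulated stand-in for bab9 (hypothetical values, not the course dataset)
set.seed(1)
toy <- data.frame(gestwks = runif(200, min = 30, max = 42))
toy$bweight <- 3200 + 200 * (toy$gestwks - 39) + rnorm(200, sd = 450)

# Option 1: store the centered values as a new column in the data frame
# (gestwks39 is an illustrative name, not one used in the course material)
toy$gestwks39 <- toy$gestwks - 39
fit_col <- lm(bweight ~ gestwks39, data = toy)

# Option 2: center inside the formula with I(), leaving the data untouched
fit_I <- lm(bweight ~ I(gestwks - 39), data = toy)

# Both fits describe the same model; only the coefficient labels differ
same_fit <- all.equal(unname(coef(fit_col)), unname(coef(fit_I)))
same_fit
```

Storing a new column is convenient when the centered variable will be reused in several models, while the pipeline or I() approach avoids cluttering the dataset with derived variables.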
Exercise 16.5: Compared with the previous model, which of the two coefficients has changed, and why? Which has stayed the same, and why? What does the estimated value of the changed coefficient mean in this model?
Exercise 16.6: Produce a scatter plot of bweight and matage. Do you think that there is a linear relationship between these variables? Estimate the regression line for these variables with bweight as the response variable and matage as the explanatory variable, then write down the estimated values of A and B and their 95% confidence intervals. What is the strength of evidence against the hypothesis that the coefficient B is zero? What is the equation of the fitted line? Display this line on the scatter plot.
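As a hint for Exercise 16.6, the general workflow (scatter plot, fit, confidence intervals, overlay the line) might look like the sketch below. It uses base R graphics and simulated data in a data frame called toy, so the numbers it produces say nothing about bab9. The variable names mirror the exercise, but the values are invented for illustration.

```r
# Simulated stand-in for bab9 (hypothetical values, not the course dataset)
set.seed(2)
toy <- data.frame(matage = round(runif(150, min = 20, max = 45)))
toy$bweight <- 3000 + 5 * toy$matage + rnorm(150, sd = 500)

# Scatter plot of the response against the explanatory variable
plot(bweight ~ matage, data = toy)

# Fit the regression line, then extract estimates and 95% confidence intervals
fit <- lm(bweight ~ matage, data = toy)
est <- coef(fit)      # intercept and slope
ci <- confint(fit)    # one row per coefficient, columns 2.5 % and 97.5 %

# Coefficient table, including the t value and p-value for each term
summary(fit)$coefficients

# Overlay the fitted line on the existing scatter plot
abline(fit)
```

Running the same steps with data = bab9 in place of toy should give the quantities the exercise asks for.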