10.3 Regression
We use the following data to illustrate regression. Note that the variables are stored in a data frame, which is passed to lm() as its data argument.
x <- c(1,2,3,5,6,7,10,12,13)
y <- c(1,4,5,6,7,8,9,10,15)
z <- c(2,3,7,8,9,12,8,7,6)
df <- data.frame(x = x, y = y, z = z)
10.3.1 Simple linear regression
The following code fits a simple linear regression of y on x. The syntax is lm(y ~ x, dataframe), where the second argument is the data frame that holds the variables.
SLR <- lm(y ~ x, df)
summary(SLR)
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9660 -1.2234 0.2618 0.7470 2.1627
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.5104 0.8756 1.725 0.128193
## x 0.8713 0.1134 7.686 0.000118 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.389 on 7 degrees of freedom
## Multiple R-squared: 0.8941, Adjusted R-squared: 0.8789
## F-statistic: 59.08 on 1 and 7 DF, p-value: 0.0001175
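As a quick check on the coefficient table above, the slope and intercept of a simple linear regression can be reproduced from sample moments. The following is a minimal sketch, assuming the same x and y as in the data section; the names slope_manual and intercept_manual are introduced here for illustration only.

```r
# Same data as in the section above
x <- c(1, 2, 3, 5, 6, 7, 10, 12, 13)
y <- c(1, 4, 5, 6, 7, 8, 9, 10, 15)
df <- data.frame(x = x, y = y)

SLR <- lm(y ~ x, data = df)

# Closed-form OLS estimates for a single regressor:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
slope_manual <- cov(df$x, df$y) / var(df$x)
intercept_manual <- mean(df$y) - slope_manual * mean(df$x)

# Compare with the estimates stored in the fitted model
coef(SLR)
c(intercept_manual, slope_manual)
```

The two printed vectors should agree with each other and with the Estimate column of the summary.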
10.3.2 Multiple linear regression
The following code fits a multiple regression of y on x and z. The syntax is lm(y ~ x + z, dataframe).
MLR <- lm(y ~ x + z, df)
summary(MLR)
##
## Call:
## lm(formula = y ~ x + z, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8920 -1.2104 0.1329 0.8179 2.3030
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.25034 1.34526 0.929 0.388522
## x 0.85666 0.13321 6.431 0.000668 ***
## z 0.05167 0.19124 0.270 0.796060
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.492 on 6 degrees of freedom
## Multiple R-squared: 0.8953, Adjusted R-squared: 0.8605
## F-statistic: 25.67 on 2 and 6 DF, p-value: 0.001146
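To see where the estimates above come from, the coefficients can also be obtained by solving the normal equations directly. This is only an illustrative sketch, assuming the same data as before; lm() itself uses a QR decomposition rather than this formula, and the names X and beta_manual are introduced here for illustration.

```r
# Same data as in the section above
x <- c(1, 2, 3, 5, 6, 7, 10, 12, 13)
y <- c(1, 4, 5, 6, 7, 8, 9, 10, 15)
z <- c(2, 3, 7, 8, 9, 12, 8, 7, 6)
df <- data.frame(x = x, y = y, z = z)

MLR <- lm(y ~ x + z, data = df)

# Design matrix used by the fit: an intercept column plus x and z
X <- model.matrix(MLR)

# Solve the normal equations (X'X) b = X'y for b
beta_manual <- solve(crossprod(X), crossprod(X, df$y))

# Side-by-side comparison of the two sets of estimates
cbind(lm = coef(MLR), manual = drop(beta_manual))
```

Both columns should match the Estimate column of summary(MLR).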
10.3.3 Interaction terms
The following code fits a multiple regression with an interaction term. The syntax is lm(y ~ x + z + x:z, dataframe), where x:z denotes the interaction between x and z.
MLRDummy <- lm(y ~ x + z + x:z, df)
summary(MLRDummy)
##
## Call:
## lm(formula = y ~ x + z + x:z, data = df)
##
## Residuals:
## 1 2 3 4 5 6 7 8 9
## -0.3080 0.9573 -0.3143 -0.7246 -0.1773 1.1230 -0.5891 -1.6242 1.6572
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.33212 1.96748 -0.677 0.5284
## x 1.59845 0.46581 3.432 0.0186 *
## z 0.64902 0.40027 1.621 0.1658
## x:z -0.12819 0.07789 -1.646 0.1607
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.316 on 5 degrees of freedom
## Multiple R-squared: 0.9321, Adjusted R-squared: 0.8914
## F-statistic: 22.89 on 3 and 5 DF, p-value: 0.002386
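In R's formula syntax, x * z is shorthand for x + z + x:z, so the model above can be written more compactly. The sketch below, assuming the same data, checks that both formulas give the same coefficients and then computes the implied slope of x at one value of z. The names fit_long, fit_short, z0 and slope_x_at_z0 are hypothetical and used only for illustration.

```r
# Same data as in the section above
x <- c(1, 2, 3, 5, 6, 7, 10, 12, 13)
y <- c(1, 4, 5, 6, 7, 8, 9, 10, 15)
z <- c(2, 3, 7, 8, 9, 12, 8, 7, 6)
df <- data.frame(x = x, y = y, z = z)

# Explicit main effects plus interaction, and the equivalent shorthand
fit_long  <- lm(y ~ x + z + x:z, data = df)
fit_short <- lm(y ~ x * z, data = df)

# TRUE if both formulas produce the same named coefficients
all.equal(coef(fit_long), coef(fit_short))

# With an interaction, the slope of x depends on z:
# d(y)/d(x) = b_x + b_xz * z, evaluated here at the sample mean of z
b <- coef(fit_long)
z0 <- mean(df$z)
slope_x_at_z0 <- unname(b["x"] + b["x:z"] * z0)
slope_x_at_z0
```

Because of the interaction, the coefficient on x alone is the slope of x only when z equals zero, which is why the slope is evaluated at a chosen z0 here.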
10.3.4 Robust standard errors
To obtain heteroskedasticity-robust standard errors, we use the sandwich package to compute a robust variance-covariance matrix. We then test the coefficients against that matrix with the lmtest package.
First we install and load the packages.
install.packages(c("sandwich","lmtest"))
library(sandwich)
library(lmtest)
Then we compute the robust variance-covariance matrix using vcovHC().
To match the result of Stata's robust option, we set the type to "HC1". Then we run coeftest() with the specified variance-covariance matrix.
MLR <- lm(y ~ x + z, df)
coeftest(MLR, vcov = vcovHC(MLR, type = "HC1"))
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.250343 1.117886 1.1185 0.306129
## x 0.856663 0.194468 4.4052 0.004543 **
## z 0.051674 0.167447 0.3086 0.768062
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
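To make the HC1 adjustment less of a black box, the robust standard errors can be rebuilt in base R from the usual sandwich formula with a degrees-of-freedom correction of n / (n - k). This is a minimal sketch under the assumption that HC1 is defined this way; it is not the sandwich package's own code, and the names bread, meat, V_hc1 and se_hc1 are introduced here for illustration.

```r
# Same data as in the section above
x <- c(1, 2, 3, 5, 6, 7, 10, 12, 13)
y <- c(1, 4, 5, 6, 7, 8, 9, 10, 15)
z <- c(2, 3, 7, 8, 9, 12, 8, 7, 6)
df <- data.frame(x = x, y = y, z = z)

MLR <- lm(y ~ x + z, data = df)

X <- model.matrix(MLR)   # design matrix including the intercept
e <- residuals(MLR)      # OLS residuals
n <- nrow(X)             # number of observations
k <- ncol(X)             # number of estimated coefficients

# Sandwich pieces: (X'X)^{-1} on the outside, X' diag(e^2) X in the middle
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # each row of X is scaled by its residual

# HC1: the basic sandwich estimator scaled by n / (n - k)
V_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

# Robust standard errors are the square roots of the diagonal
se_hc1 <- sqrt(diag(V_hc1))
se_hc1
```

These values can be compared with the Std. Error column reported by coeftest() above.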