Chapter 12 Inference
The first step is to create a hypothetical dataset.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)                            # error term
beta = c(3.5, 5)                        # true slopes
y = 10 + beta[1]*x1 + beta[2]*x2 + e    # true intercept is 10
X = cbind(1, x1, x2)                    # design matrix with a column of ones
ols = lm(y ~ x1 + x2)
summary(ols)
##
## Call:
## lm(formula = y ~ x1 + x2)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -3.15798 -0.71935  0.00887  0.70818  3.11361
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.6940067  0.4721931   20.53   <2e-16 ***
## x1          3.5002971  0.0009249 3784.62   <2e-16 ***
## x2          5.0274460  0.0212557  236.52   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.044 on 497 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 7.25e+06 on 2 and 497 DF, p-value: < 2.2e-16
confint(ols,level = 0.95)
##                2.5 %    97.5 %
## (Intercept) 8.766266 10.621748
## x1          3.498480  3.502114
## x2          4.985684  5.069208
B = solve(t(X)%*%X)%*%(t(X)%*%y)
B
##        [,1]
##    9.694007
## x1 3.500297
## x2 5.027446
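As a cross-check on the matrix formula above, the sketch below solves the normal equations \((X'X)B = X'y\) directly with `solve(A, b)` instead of forming the inverse, and prints the result next to the `lm()` coefficients. The name `B_alt` is introduced here for illustration only; the data are regenerated so the block runs on its own.

```r
# Same simulated data as in the text.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)
beta = c(3.5, 5)
y = 10 + beta[1]*x1 + beta[2]*x2 + e
X = cbind(1, x1, x2)

# crossprod(X) is t(X) %*% X and crossprod(X, y) is t(X) %*% y.
# solve(A, b) solves A %*% B = b without computing the inverse of A.
B_alt = solve(crossprod(X), crossprod(X, y))

# Compare with the coefficients estimated by lm().
ols = lm(y ~ x1 + x2)
print(cbind(normal_eq = as.vector(B_alt), lm = unname(coef(ols))))
```

Solving the system directly is usually preferred over an explicit inverse for numerical reasons, although for a problem of this size both routes should agree to many digits.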
Compute the t statistic: \[t_n(\theta) = \frac{\hat \theta - \theta}{s(\hat \theta)}\]
Take the null hypothesis \(\theta = 0\).
This requires the standard error \(s(\hat\theta)\).
#y_hat = X%*%B
e_hat = y - X%*%B
k = ncol(X)
sigma_hat = (1/(n-k)) * t(e_hat)%*%e_hat # estimated error variance; dividing by n-k corrects the bias
sigma_hat
## [,1]
## [1,] 1.090167
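The value above is the estimated error variance, so its square root should match the "Residual standard error" reported by `summary(ols)`. A minimal sketch of that check, using the extractor functions `resid()`, `df.residual()` and `sigma()` from base R's stats package; the name `s2` is chosen here for illustration.

```r
# Same simulated data as in the text.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)
beta = c(3.5, 5)
y = 10 + beta[1]*x1 + beta[2]*x2 + e
ols = lm(y ~ x1 + x2)

# Sum of squared residuals divided by the residual degrees of freedom (n - k).
s2 = sum(resid(ols)^2) / df.residual(ols)

# sigma() returns the residual standard error, so square it to compare.
print(c(s2 = s2, sigma_squared = sigma(ols)^2))
```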
V = sigma_hat[1,1] * solve(t(X)%*%X)
ep = sqrt(diag(V))
ep
##                      x1          x2
## 0.472193136 0.000924874 0.021255675
t = B / ep
t
##          [,1]
##      20.52975
## x1 3784.62042
## x2  236.52253
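The standard errors and t statistics computed by hand can be compared with what `lm()` stores. The sketch below takes the covariance matrix from `vcov()` and the coefficient table from `coef(summary())`; the names `ep_lm`, `t_lm` and `tab` are introduced here for illustration.

```r
# Same simulated data as in the text.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)
beta = c(3.5, 5)
y = 10 + beta[1]*x1 + beta[2]*x2 + e
ols = lm(y ~ x1 + x2)

# Standard errors: square roots of the diagonal of the estimated covariance matrix.
ep_lm = sqrt(diag(vcov(ols)))

# t statistics for the null hypothesis that each coefficient equals zero.
t_lm = coef(ols) / ep_lm

# The coefficient table of summary() reports the same two columns.
tab = coef(summary(ols))
print(cbind(ep_lm, t_lm))
print(tab[, c("Std. Error", "t value")])
```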
Set the critical value c for a \(5\%\) significance level
1 - (0.05/2)
## [1] 0.975
c = qt(.975,n-k)
c
## [1] 1.964749
Testing against the true value of the coefficient on x1, \(\theta = \beta_1 = 3.5\)
t = (B[2,1] - beta[1]) / ep[2]
t
## x1
## 0.3212361
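P-value
Assuming that, under the null hypothesis, the t statistic follows a Student t distribution with \(n-k\) degrees of freedom, \(t \sim t_{n-k}\)
2 * (1 - pt(abs(t), df = n-k))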
## x1
## 0.7481665
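The critical-value rule and the p-value rule lead to the same decision: reject when \(|t| > c\), which is equivalent to a two-sided p-value below the significance level. A minimal sketch under the same setup; the names `t_x1`, `p_x1`, `crit` and `reject` are introduced here for illustration, and `crit` is used instead of `c` to avoid masking the base function `c()`.

```r
# Same simulated data as in the text.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)
beta = c(3.5, 5)
y = 10 + beta[1]*x1 + beta[2]*x2 + e
ols = lm(y ~ x1 + x2)

# t statistic for the null hypothesis that the x1 coefficient equals its true value 3.5.
t_x1 = (coef(ols)["x1"] - beta[1]) / sqrt(diag(vcov(ols)))["x1"]

# Two-sided p-value from the upper tail of the t distribution.
p_x1 = 2 * pt(abs(t_x1), df = df.residual(ols), lower.tail = FALSE)

# Critical value for a 5% two-sided test and the resulting decision.
crit = qt(0.975, df = df.residual(ols))
reject = abs(t_x1) > crit
print(c(t = unname(t_x1), p_value = unname(p_x1), critical = crit))
print(reject)
```

Taking `abs()` of the statistic keeps the p-value correct when the estimate falls below the hypothesized value and the statistic is negative.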
12.0.0.1 Confidence Intervals
\[C_n = [\hat \theta - c \cdot s(\hat \theta),\ \hat\theta + c \cdot s(\hat\theta)]\]
Set the critical value c for a \(5\%\) significance level (a 95% confidence interval)
c = qt(.975,n-k)
c
## [1] 1.964749
Confidence interval for \(\beta_0\)
lower = B[1,1] - c * ep[1]
upper = B[1,1] + c * ep[1]
print(lower);print(upper)
##
## 8.766266
##
## 10.62175
Confidence interval for \(\beta_1\)
lower = B[2,1] - c * ep[2]
upper = B[2,1] + c * ep[2]
print(lower);print(upper)
## x1
## 3.49848
## x1
## 3.502114
Confidence interval for \(\beta_2\)
lower = B[3,1] - c * ep[3]
upper = B[3,1] + c * ep[3]
print(lower);print(upper)
## x2
## 4.985684
## x2
## 5.069208
Set the critical value c for a \(1\%\) significance level (a 99% confidence interval)
c = qt(.995,n-k)
c
## [1] 2.585758
Confidence interval for \(\beta_0\)
lower = B[1,1] - c * ep[1]
upper = B[1,1] + c * ep[1]
print(lower);print(upper)
##
## 8.47303
##
## 10.91498
Confidence interval for \(\beta_1\)
lower = B[2,1] - c * ep[2]
upper = B[2,1] + c * ep[2]
print(lower);print(upper)
## x1
## 3.497906
## x1
## 3.502689
Confidence interval for \(\beta_2\)
lower = B[3,1] - c * ep[3]
upper = B[3,1] + c * ep[3]
print(lower);print(upper)
## x2
## 4.972484
## x2
## 5.082408
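The coefficient-by-coefficient calculations above repeat the same formula, so they can be collected into one function. The sketch below is a hypothetical helper, `ci_manual()`, which is not part of the original text; it computes t-based intervals for all coefficients at a chosen confidence level, and its output can be compared with `confint()`.

```r
# Same simulated data as in the text.
set.seed(1)
n = 500
x1 = rnorm(n, mean = 500, sd = 50)
x2 = rpois(n, lambda = 5)
e = rnorm(n)
beta = c(3.5, 5)
y = 10 + beta[1]*x1 + beta[2]*x2 + e
ols = lm(y ~ x1 + x2)

# Hypothetical helper: estimate -/+ critical value times standard error,
# for every coefficient of a fitted lm object.
ci_manual = function(fit, level = 0.95) {
  est = coef(fit)
  ep = sqrt(diag(vcov(fit)))
  crit = qt(1 - (1 - level) / 2, df = df.residual(fit))
  cbind(lower = est - crit * ep, upper = est + crit * ep)
}

print(ci_manual(ols, level = 0.95))
print(ci_manual(ols, level = 0.99))
print(confint(ols, level = 0.99))
```

Raising the confidence level from 95% to 99% increases the critical value, so every interval widens, which matches the pattern in the results above.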