10 Day 10 (June 16)
10.2 Maximum Likelihood Estimation
- Maximum likelihood estimation for the linear model
- Assume that
- Write out the likelihood function
- Maximize the likelihood function
- First take the natural log of , which gives
- Recall that
- Maximize the log-likelihood function
- Visual for data and
- Three ways to do it in program R
- Using matrix calculus and algebra
# Data <- c(0.16,2.82,2.24) y <- matrix(c(1,1,1,1,2,3),nrow=3,ncol=2,byrow=FALSE) X <- length(y) n # Maximum likelihood estimation using closed from estimators <- solve(t(X)%*%X)%*%t(X)%*%y beta <- (1/n)*t(y-X%*%beta)%*%(y-X%*%beta) sigma2 beta
## [,1] ## [1,] -0.34 ## [2,] 1.04
sigma2
## [,1] ## [1,] 0.5832
- Using modern (circa 1970’s) optimization techniques
# Maximum likelihood estimation estimation using Nelder-Mead algorithm <- c(0.16,2.82,2.24) y <- matrix(c(1,1,1,1,2,3),nrow=3,ncol=2,byrow=FALSE) X <- function(par){ negloglik <- par[1:2] beta <- par[3] sigma2 -sum(dnorm(y,X%*%beta,sqrt(sigma2),log=TRUE)) }optim(par=c(0,0,1),fn=negloglik,method = "Nelder-Mead")
## $par ## [1] -0.3397162 1.0398552 0.5831210 ## ## $value ## [1] 3.447978 ## ## $counts ## function gradient ## 150 NA ## ## $convergence ## [1] 0 ## ## $message ## NULL
- Using modern and user friendly statistical computing software
# Maximum likelihood estimation using lm function # Note: estimate of sigma2 is not the MLE! <- data.frame(y = c(0.16,2.82,2.24),x = c(1,2,3)) df <- lm(y~x,data=df) m1 coef(m1)
## (Intercept) x ## -0.34 1.04
summary(m1)$sigma^2
## [1] 1.7496
summary(m1)
## ## Call: ## lm(formula = y ~ x, data = df) ## ## Residuals: ## 1 2 3 ## -0.54 1.08 -0.54 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) -0.3400 2.0205 -0.168 0.894 ## x 1.0400 0.9353 1.112 0.466 ## ## Residual standard error: 1.323 on 1 degrees of freedom ## Multiple R-squared: 0.5529, Adjusted R-squared: 0.1057 ## F-statistic: 1.236 on 1 and 1 DF, p-value: 0.4663
- Live example
10.3 Confidence intervals for paramters
Example
<- c(63, 68, 61, 44, 103, 90, 107, 105, 76, 46, 60, 66, 58, 39, 64, 29, 37, y 27, 38, 14, 38, 52, 84, 112, 112, 97, 131, 168, 70, 91, 52, 33, 33, 27, 18, 14, 5, 22, 31, 23, 14, 18, 23, 27, 44, 18, 19) <- 1965:2011 year <- data.frame(y = y, year = year) df plot(x = df$year, y = df$y, xlab = "Year", ylab = "Annual count", main = "", col = "brown", pch = 20, xlim = c(1965, 2040))
- Is the population size really decreasing?
- Write out the model we should use answer this question
- How can we assess the uncertainty in our estimate of ?
- Confidence intervals in R
<- lm(y~year,data=df) m1 coef(m1)
## (Intercept) year ## 2356.48797 -1.15784
confint(m1,level=0.95)
## 2.5 % 97.5 % ## (Intercept) 929.80699 3783.1689540 ## year -1.87547 -0.4402103
- Note and let where and . In R we can extract using
vcov()
.
vcov(m1)
## (Intercept) year ## (Intercept) 501753.2825 -252.3792372 ## year -252.3792 0.1269513
Note that
diag(vcov(m1))^0.5
## (Intercept) year ## 708.3454542 0.3563023
corresponds to the column
Std. Error
fromsummary()
summary(m1)
## ## Call: ## lm(formula = y ~ year, data = df) ## ## Residuals: ## Min 1Q Median 3Q Max ## -45.333 -20.597 -9.754 14.035 117.929 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 2356.4880 708.3455 3.327 0.00176 ** ## year -1.1578 0.3563 -3.250 0.00219 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 33.13 on 45 degrees of freedom ## Multiple R-squared: 0.1901, Adjusted R-squared: 0.1721 ## F-statistic: 10.56 on 1 and 45 DF, p-value: 0.00219
- When is known
- When is estimated where
- Deriving the confidence interval for
- What is ?
- “By hand” in R
<- length(df$y) - 2 nu <- qt(p = 0.025,df = nu) a a
## [1] -2.014103
<- qt(p = 0.975,df = nu) b b
## [1] 2.014103
<- coef(m1)[2] beta1.hat <- (diag(vcov(m1))^0.5)[2] sigma.beta1.hat - b*sigma.beta1.hat beta1.hat
## year ## -1.87547
- a*sigma.beta1.hat beta1.hat
## year ## -0.4402103
confint(m1)[2,]
## 2.5 % 97.5 % ## -1.8754696 -0.4402103