3 Day 3 (June 7)

3.1 Announcements

Meet-and-greet tomorrow
A comment about email help!
If office hours times don’t work for you, let me know
Recommended reading
- Chapters 1 and 2 (pp. 1-28) in Linear Models with R
- Chapter 2 in Applied Regression and ANOVA Using SAS

Matrices

```
X <- matrix(c(1,2,3,4,5,6),nrow=3,ncol=2,byrow=FALSE)
X
```
```
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```

Matrix inversion

```
solve(t(X)%*%X)
```
```
##            [,1]       [,2]
## [1,]  1.4259259 -0.5925926
## [2,] -0.5925926  0.2592593
```
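
As a quick sanity check (a sketch added for illustration, not part of the original notes), a matrix multiplied by its inverse should give back the identity matrix, up to floating-point error. The names `XtX` and `XtX_inv` are introduced here only for this example.

```r
X <- matrix(c(1,2,3,4,5,6), nrow=3, ncol=2, byrow=FALSE)

XtX     <- t(X) %*% X    # 2 x 2 symmetric matrix X'X
XtX_inv <- solve(XtX)    # its inverse

# Compare the product with the 2 x 2 identity, allowing for rounding error
all.equal(XtX_inv %*% XtX, diag(1, 2))
```

`all.equal()` is used instead of `==` because the product is only equal to the identity within numerical tolerance.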

Determinant of a matrix

```
I <- diag(1,3)
I
```
```
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1
```
```
det(I)
```
```
## [1] 1
```
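
One reason the determinant matters here: a square matrix can be inverted only if its determinant is nonzero. The sketch below illustrates this with the matrix `X` from the start of this section and a made-up matrix `Z` (an assumption for this example) whose second column is twice the first.

```r
X <- matrix(c(1,2,3,4,5,6), nrow=3, ncol=2, byrow=FALSE)
det_full <- det(t(X) %*% X)       # nonzero, so X'X can be inverted

# Z is a hypothetical matrix with linearly dependent columns
Z <- cbind(c(1,2,3), c(2,4,6))
det_singular <- det(t(Z) %*% Z)   # zero (up to rounding), so Z'Z has no inverse

c(det_full, det_singular)
```

Calling `solve(t(Z) %*% Z)` would fail with an error because the matrix is singular.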

Quadratic form
- $\mathbf{y}^{\prime}\mathbf{S}\mathbf{y}$
Derivative of a quadratic form (note: $\mathbf{S}$ is a symmetric matrix, e.g., $\mathbf{X}^{\prime}\mathbf{X}$)
- $\frac{\partial}{\partial\mathbf{y}}\mathbf{y}^{\prime}\mathbf{S}\mathbf{y}=2\mathbf{S}\mathbf{y}$
Other useful derivatives
- $\frac{\partial}{\partial\mathbf{y}}\mathbf{x}^{\prime}\mathbf{y}=\mathbf{x}$
- $\frac{\partial}{\partial\mathbf{y}}\mathbf{X}^{\prime}\mathbf{y}=\mathbf{X}$
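
The quadratic-form derivative can be checked numerically. The sketch below compares a central finite-difference gradient with $2\mathbf{S}\mathbf{y}$. The matrix `S`, the vector `y`, the step size `h`, and the helper `q()` are all made up for this illustration.

```r
S <- matrix(c(2, 1, 1, 3), nrow = 2)   # hypothetical symmetric matrix
y <- c(1, 2)                           # hypothetical point at which to differentiate

# Quadratic form y'Sy returned as a plain number
q <- function(y) as.numeric(t(y) %*% S %*% y)

# Central finite differences, one coordinate at a time
h <- 1e-6
grad_num <- sapply(1:2, function(j) {
  e <- rep(0, 2)
  e[j] <- h
  (q(y + e) - q(y - e)) / (2 * h)
})

grad_exact <- as.numeric(2 * S %*% y)  # analytic result 2Sy

all.equal(grad_num, grad_exact, tolerance = 1e-6)
```

If `S` were not symmetric, the analytic gradient would instead be $(\mathbf{S}+\mathbf{S}^{\prime})\mathbf{y}$, which is why the symmetry note matters.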

What is a model?
What is a linear model?
- Most widely used model in science, engineering, and statistics
- Vector form: $\mathbf{y}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\beta_{2}\mathbf{x}_{2}+\ldots+\beta_{p}\mathbf{x}_{p}+\boldsymbol{\varepsilon}$
- Matrix form: $\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}$
- Which part of the model is the mathematical model?
- Which part of the model makes the linear model a “statistical” model?
- Visual
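
To see how the vector and matrix forms line up, here is a minimal sketch with a single predictor. The values of `x1` and `beta` are made up for illustration. The column of ones in `X` carries the intercept $\beta_{0}$, and only the $\mathbf{X}\boldsymbol{\beta}$ part is computed (no $\boldsymbol{\varepsilon}$).

```r
x1   <- c(1, 2, 3)     # hypothetical predictor values
beta <- c(0.5, 2)      # hypothetical (beta_0, beta_1)

X <- cbind(1, x1)      # design matrix: intercept column, then x1

vector_form <- beta[1] + beta[2] * x1    # beta_0 + beta_1 * x1
matrix_form <- as.numeric(X %*% beta)    # X beta

all.equal(vector_form, matrix_form)
```
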
Which of the four models below are linear models?
- $\mathbf{y}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\beta_{2}\mathbf{x}_{1}^{2}+\boldsymbol{\varepsilon}$
- $\mathbf{y}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\beta_{2}\log(\mathbf{x}_{1})+\boldsymbol{\varepsilon}$
- $\mathbf{y}=\beta_{0}+\beta_{1}e^{\beta_{2}\mathbf{x}_{1}}+\boldsymbol{\varepsilon}$
- $\mathbf{y}=\beta_{0}+\beta_{1}\mathbf{x}_{1}+\log(\beta_{2})\mathbf{x}_{1}+\boldsymbol{\varepsilon}$
Why study the linear model?
- Building block for more complex models (e.g., GLMs, mixed models, machine learning)
- We know the most about it

Three options to estimate $\boldsymbol{\beta}$
- Minimize a loss function
- Maximize a likelihood function
- Find the posterior distribution
- Each option requires different assumptions

Define a measure of discrepancy between the data and the mathematical model
- Find the values of $\boldsymbol{\beta}$ that make $\mathbf{X}\boldsymbol{\beta}$ “closest” to $\mathbf{y}$
- Visual
Classic example
- $\underset{\boldsymbol{\beta}}{\operatorname{argmin}}\sum_{i=1}^{n}(y_i-\mathbf{x}_{i}^{\prime}\boldsymbol{\beta})^2$
- Or in matrix form: $\underset{\boldsymbol{\beta}}{\operatorname{argmin}}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^{\prime}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$
- Which results in $\hat{\boldsymbol{\beta}}=(\mathbf{X}^{\prime}\mathbf{X})^{-1}\mathbf{X}^{\prime}\mathbf{y}$
- In R
```
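# Example data: three observations; X has an intercept column and one predictor
y <- c(0.16,2.82,2.24)
X <- matrix(c(1,1,1,1,2,3),nrow=3,ncol=2,byrow=FALSE)

# Closed-form least squares estimate (X'X)^{-1} X'y
solve(t(X)%*%X)%*%t(X)%*%y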
```
```
##       [,1]
## [1,] -0.34
## [2,]  1.04
```
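
The closed-form expression used above follows from the matrix derivatives listed earlier in these notes. A sketch of the steps, assuming $\mathbf{X}^{\prime}\mathbf{X}$ is invertible (i.e., the columns of $\mathbf{X}$ are linearly independent), with $L(\boldsymbol{\beta})$ introduced here as a label for the loss:

$$
\begin{aligned}
L(\boldsymbol{\beta}) &= (\mathbf{y}-\mathbf{X}\boldsymbol{\beta})^{\prime}(\mathbf{y}-\mathbf{X}\boldsymbol{\beta})
= \mathbf{y}^{\prime}\mathbf{y}-2\boldsymbol{\beta}^{\prime}\mathbf{X}^{\prime}\mathbf{y}+\boldsymbol{\beta}^{\prime}\mathbf{X}^{\prime}\mathbf{X}\boldsymbol{\beta} \\
\frac{\partial L}{\partial\boldsymbol{\beta}} &= -2\mathbf{X}^{\prime}\mathbf{y}+2\mathbf{X}^{\prime}\mathbf{X}\boldsymbol{\beta} \\
\mathbf{0} &= -2\mathbf{X}^{\prime}\mathbf{y}+2\mathbf{X}^{\prime}\mathbf{X}\hat{\boldsymbol{\beta}}
\;\Rightarrow\; \mathbf{X}^{\prime}\mathbf{X}\hat{\boldsymbol{\beta}}=\mathbf{X}^{\prime}\mathbf{y}
\;\Rightarrow\; \hat{\boldsymbol{\beta}}=(\mathbf{X}^{\prime}\mathbf{X})^{-1}\mathbf{X}^{\prime}\mathbf{y}
\end{aligned}
$$

The second line uses the quadratic-form rule with $\mathbf{S}=\mathbf{X}^{\prime}\mathbf{X}$ and the linear rule with $\mathbf{x}=\mathbf{X}^{\prime}\mathbf{y}$, both applied with respect to $\boldsymbol{\beta}$.
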
```
# Minimize the sum of squared errors numerically, starting from beta = (0,0)
optim(par=c(0,0),method = c("Nelder-Mead"),fn=function(beta){t(y-X%*%beta)%*%(y-X%*%beta)})
```
```
## $par
## [1] -0.3399977  1.0399687
## 
## $value
## [1] 1.7496
## 
## $counts
## function gradient 
##       61       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
```
```
# -1 drops lm's automatic intercept because X already has a column of ones
lm(y~X-1)
```
```
## 
## Call:
## lm(formula = y ~ X - 1)
## 
## Coefficients:
##    X1     X2  
## -0.34   1.04
```
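
As a follow-up sketch (not from the original notes), the same estimate can be obtained by solving the normal equations $\mathbf{X}^{\prime}\mathbf{X}\boldsymbol{\beta}=\mathbf{X}^{\prime}\mathbf{y}$ directly, without forming the inverse explicitly. The names `beta_hat`, `e`, and `fit` are introduced here for illustration. The residuals should be orthogonal to the columns of `X`, and the result should agree with `lm()`.

```r
y <- c(0.16, 2.82, 2.24)
X <- matrix(c(1,1,1,1,2,3), nrow=3, ncol=2, byrow=FALSE)

# Solve X'X beta = X'y; crossprod(X) is t(X) %*% X, crossprod(X, y) is t(X) %*% y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Residuals; X'e should be zero up to rounding error
e <- y - X %*% beta_hat
crossprod(X, e)

# Compare with the coefficients reported by lm()
fit <- lm(y ~ X - 1)
all.equal(as.numeric(beta_hat), unname(coef(fit)))
```

Solving the linear system with the two-argument form of `solve()` is generally preferred numerically over computing the inverse and then multiplying.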