8 Least Squares and Applications
The Least Squares method is a foundational statistical technique used to model the relationship between variables and predict outcomes. By minimizing the sum of squared differences between observed data points and the values predicted by a model, it produces the parameter estimates that fit a given dataset best in the squared-error sense. This approach is widely applied across fields such as data analysis, engineering, economics, and machine learning.
This document explores the Least Squares method, focusing on its application in linear regression. Practical examples in Python are provided to demonstrate how to implement this method and interpret results effectively.
8.1 Least Squares Method
The Least Squares Method is a statistical technique used to find the best-fitting line through a set of data points. In the context of simple linear regression, it minimizes the sum of squared differences between the observed values and the values predicted by the model.
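To make the idea concrete, the following minimal Python sketch (using a small hypothetical dataset, not data from this chapter) compares the sum of squared differences for the least-squares line against an arbitrary line; the least-squares fit gives the smaller value.

```python
import numpy as np

# Small hypothetical dataset, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def rss(intercept, slope):
    """Sum of squared differences between observed y and the line's predictions."""
    residuals = y - (intercept + slope * x)
    return np.sum(residuals ** 2)

# Least-squares straight-line fit (degree-1 polynomial)
slope_ls, intercept_ls = np.polyfit(x, y, 1)

print("Least-squares line RSS:", rss(intercept_ls, slope_ls))
print("Arbitrary line RSS:    ", rss(0.0, 1.0))  # noticeably larger
```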
8.2 Linear Regression Model and Matrix Equation
In simple linear regression, we aim to find a line that best fits the data. Suppose we have a dataset with $n$ observations $(x_i, y_i)$ for $i = 1, \dots, n$. The model for each observation is

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$

Where:

$y_i$ is the observed value, $x_i$ is the predictor (independent variable), $\beta_0$ is the intercept, $\beta_1$ is the slope of the line, and $\varepsilon_i$ is the residual (error term) for each data point.

We can write this equation for all data points in vector and matrix form as:

$$
\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\mathbf{y} =
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix},
\quad
X =
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix},
\quad
\boldsymbol{\beta} =
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix},
\quad
\boldsymbol{\varepsilon} =
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$
8.3 Finding the Coefficients Using Least Squares
In linear regression, the primary objective is to find the best-fitting line that represents the relationship between the independent variable ($x$) and the dependent variable ($y$). This is done by choosing the coefficients that minimize the residual sum of squares (RSS).

Residuals ($\varepsilon_i$) are the differences between the observed values and the values predicted by the model, $\varepsilon_i = y_i - (\beta_0 + \beta_1 x_i)$.

The RSS is calculated by summing the squared residuals across all data points, which gives the formula:

$$
\text{RSS} = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2
$$
In matrix form, this residual sum of squares (RSS) can be written as:

$$
\text{RSS}(\boldsymbol{\beta}) = (\mathbf{y} - X\boldsymbol{\beta})^\top (\mathbf{y} - X\boldsymbol{\beta})
$$

Expanding this quadratic form:

$$
\text{RSS}(\boldsymbol{\beta}) = \mathbf{y}^\top \mathbf{y} - 2\boldsymbol{\beta}^\top X^\top \mathbf{y} + \boldsymbol{\beta}^\top X^\top X \boldsymbol{\beta}
$$

where:

$\mathbf{y}^\top \mathbf{y}$ is a scalar resulting from the dot product of $\mathbf{y}$ with itself, $-2\boldsymbol{\beta}^\top X^\top \mathbf{y}$ is the cross-term representing the interaction between the predictors and the response, and $\boldsymbol{\beta}^\top X^\top X \boldsymbol{\beta}$ is the quadratic term involving the coefficients $\boldsymbol{\beta}$.
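The algebra above can be checked numerically. The short sketch below (random arrays, purely illustrative) confirms that the direct and expanded forms of the RSS agree:

```python
import numpy as np

# Sanity check: direct RSS equals its expanded quadratic form
rng = np.random.default_rng(0)
n = 10
X = np.column_stack((np.ones(n), rng.normal(size=n)))  # intercept column + one predictor
y = rng.normal(size=n)
beta = rng.normal(size=2)

rss_direct = (y - X @ beta) @ (y - X @ beta)
rss_expanded = y @ y - 2 * beta @ (X.T @ y) + beta @ (X.T @ X) @ beta

print(np.isclose(rss_direct, rss_expanded))  # True
```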
To minimize the RSS, we take the derivative of $\text{RSS}(\boldsymbol{\beta})$ with respect to $\boldsymbol{\beta}$ and set it equal to zero.

The derivatives of the three terms are:

$\frac{\partial}{\partial \boldsymbol{\beta}} \mathbf{y}^\top \mathbf{y} = 0$, as $\mathbf{y}^\top \mathbf{y}$ is independent of $\boldsymbol{\beta}$. $\frac{\partial}{\partial \boldsymbol{\beta}} \left( -2\boldsymbol{\beta}^\top X^\top \mathbf{y} \right) = -2X^\top \mathbf{y}$. $\frac{\partial}{\partial \boldsymbol{\beta}} \left( \boldsymbol{\beta}^\top X^\top X \boldsymbol{\beta} \right) = 2X^\top X \boldsymbol{\beta}$.

Combining these:

$$
\frac{\partial\, \text{RSS}}{\partial \boldsymbol{\beta}} = -2X^\top \mathbf{y} + 2X^\top X \boldsymbol{\beta}
$$

To find the value of $\boldsymbol{\beta}$ that minimizes the RSS, we set this derivative equal to zero:

$$
-2X^\top \mathbf{y} + 2X^\top X \boldsymbol{\beta} = 0
$$

Simplify:

$$
X^\top X \boldsymbol{\beta} = X^\top \mathbf{y}
$$

This is known as the normal equation.
8.4 Solving the Normal Equation
To find $\hat{\boldsymbol{\beta}}$, we multiply both sides of the normal equation by $(X^\top X)^{-1}$, assuming $X^\top X$ is invertible:

$$
\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}
$$

This gives us the values of the coefficients $\hat{\beta}_0$ (intercept) and $\hat{\beta}_1$ (slope) that minimize the residual sum of squares.
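In practice, the inverse is rarely formed explicitly. A minimal sketch (with simulated placeholder data, not the example data used later in this chapter) shows two equivalent ways to obtain $\hat{\boldsymbol{\beta}}$ in NumPy: solving the normal equation with `np.linalg.solve`, or calling the dedicated solver `np.linalg.lstsq`, which is generally more numerically stable.

```python
import numpy as np

# Simulated placeholder data, for illustration only
rng = np.random.default_rng(42)
X = np.column_stack((np.ones(20), rng.uniform(0, 10, 20)))
y = 3 + 2 * X[:, 1] + rng.normal(0, 1, 20)

# Solve the normal equation X'X beta = X'y without an explicit inverse
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Dedicated least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)
print(beta_lstsq)  # the two solutions agree closely
```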
8.5 Linear Regression Example
8.5.1 Data
We have the following data:
| x | y |
|---|---|
| 1 | 2.197622 |
| 2 | 5.849113 |
| 3 | 16.793542 |
| 4 | 11.352542 |
| 5 | 13.646439 |
| 6 | 23.575325 |
| 7 | 19.304581 |
| 8 | 12.674694 |
| 9 | 17.565736 |
| 10 | 20.771690 |
The data can be generated with the following Python code; an equivalent R version and its output are shown after it:
```python
import numpy as np
import pandas as pd

# Set seed for reproducibility
np.random.seed(123)

# Create the data
x = np.arange(1, 11)
y = 2 * x + 3 + np.random.normal(0, 5, 10)

# Create a DataFrame
data = pd.DataFrame({'x': x, 'y': y})

# Display the data
print(data)
```
```r
set.seed(123)
x <- 1:10
y <- 2 * x + 3 + rnorm(10, mean = 0, sd = 5)
data <- data.frame(x, y)

# Display the data
data
```

```
    x         y
1   1  2.197622
2   2  5.849113
3   3 16.793542
4   4 11.352542
5   5 13.646439
6   6 23.575325
7   7 19.304581
8   8 12.674694
9   9 17.565736
10 10 20.771690
```
8.5.2 Linear Regression Equation
The linear regression model for this data can be written as:

$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$

Where:

$y_i$ is the value to be predicted (the response), $x_i$ is the input data, $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon_i$ is the error.
8.5.3 Matrix and Vector
Create the design matrix $X$ (with a column of ones for the intercept) and the response vector $\mathbf{y}$:
```python
import numpy as np

# Assuming data is already defined as a pandas DataFrame
X = np.column_stack((np.ones(len(data)), data['x']))  # Add a column of ones for the intercept
y = data['y'].values  # Convert the 'y' column to a numpy array

# Display X and y
print("X:\n", X)
print("y:\n", y)
```
```r
# Matrix X and vector y
X <- cbind(1, data$x)  # Add a column of ones for the intercept
y <- data$y

X
y
```

```
      [,1] [,2]
 [1,]    1    1
 [2,]    1    2
 [3,]    1    3
 [4,]    1    4
 [5,]    1    5
 [6,]    1    6
 [7,]    1    7
 [8,]    1    8
 [9,]    1    9
[10,]    1   10
 [1]  2.197622  5.849113 16.793542 11.352542 13.646439 23.575325 19.304581
 [8] 12.674694 17.565736 20.771690
```
8.5.4 Compute $X^\top X$

Next, we compute the matrix product $X^\top X$:
```python
import numpy as np

# Assuming X is already defined as a numpy array
X_t_X = np.dot(X.T, X)

# Display the result
print(X_t_X)
```
```r
# Compute X'X
X_t_X <- t(X) %*% X
X_t_X
```

```
     [,1] [,2]
[1,]   10   55
[2,]   55  385
```
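As a quick check of this output, the entries of $X^\top X$ for a simple linear regression are just sums of the $x$ values:

$$
X^\top X =
\begin{bmatrix}
n & \sum_{i=1}^{n} x_i \\[4pt]
\sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2
\end{bmatrix}
=
\begin{bmatrix}
10 & 55 \\
55 & 385
\end{bmatrix}
$$

since $n = 10$, $\sum x_i = 1 + 2 + \dots + 10 = 55$, and $\sum x_i^2 = 1 + 4 + \dots + 100 = 385$.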
8.5.5 Compute $X^\top \mathbf{y}$

Now, we compute the product $X^\top \mathbf{y}$:
```python
# Compute X'Y
X_t_y = np.dot(X.T, y)

# Display the result
print(X_t_y)
```
```r
# Compute X'Y
X_t_y <- t(X) %*% y
X_t_y
```

```
         [,1]
[1,] 143.7313
[2,] 921.7089
```
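Similarly, the two entries of $X^\top \mathbf{y}$ are the sum of the responses and the sum of the products $x_i y_i$:

$$
X^\top \mathbf{y} =
\begin{bmatrix}
\sum_{i=1}^{n} y_i \\[4pt]
\sum_{i=1}^{n} x_i y_i
\end{bmatrix}
\approx
\begin{bmatrix}
143.7313 \\
921.7089
\end{bmatrix}
$$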
8.5.6 Compute the Inverse of $X^\top X$

To compute $\hat{\boldsymbol{\beta}}$, we need the inverse $(X^\top X)^{-1}$:
```python
# Compute the inverse of X'X
inv_X_t_X = np.linalg.inv(X_t_X)

# Display the result
print(inv_X_t_X)
```
```r
# Compute the inverse of X'X
inv_X_t_X <- solve(X_t_X)
inv_X_t_X
```

```
            [,1]        [,2]
[1,]  0.46666667 -0.06666667
[2,] -0.06666667  0.01212121
```
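Because $X^\top X$ is only $2 \times 2$, this inverse can also be verified by hand with the standard $2 \times 2$ inversion formula:

$$
(X^\top X)^{-1}
= \frac{1}{10 \cdot 385 - 55^2}
\begin{bmatrix} 385 & -55 \\ -55 & 10 \end{bmatrix}
= \frac{1}{825}
\begin{bmatrix} 385 & -55 \\ -55 & 10 \end{bmatrix}
\approx
\begin{bmatrix} 0.4667 & -0.0667 \\ -0.0667 & 0.0121 \end{bmatrix}
$$

which matches the numerical result above.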
8.6 Compute the Vector $\hat{\boldsymbol{\beta}}$

Now we can compute the coefficient vector $\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$:
```python
# Compute the beta vector
beta = np.dot(inv_X_t_X, X_t_y)

# Display the result
print(beta)
```
```r
# Compute the beta vector
beta <- inv_X_t_X %*% X_t_y
beta
```

```
         [,1]
[1,] 5.627337
[2,] 1.590144
```
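As a cross-check (not part of the worked example itself), the same coefficients can be obtained directly with NumPy's least-squares solver, assuming the `X` and `y` arrays built in Section 8.5.3 are still in scope:

```python
import numpy as np

# Cross-check: solve the least-squares problem directly from X and y
beta_check, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_check)  # approximately [5.627337, 1.590144]
```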
8.7 Final Linear Regression Equation

Thus, the estimated regression coefficients are:

$$
\hat{\beta}_0 = 5.627337, \qquad \hat{\beta}_1 = 1.590144
$$

Therefore, the final regression equation is:

$$
\hat{y} = 5.627337 + 1.590144\, x
$$
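To see the fitted equation in use, a short sketch (assuming the `beta` vector computed above, with `beta[0]` the intercept and `beta[1]` the slope) predicts $\hat{y}$ for a new value $x = 12$:

```python
# Predict y for a new x value using the fitted coefficients
x_new = 12
y_hat = beta[0] + beta[1] * x_new
print(y_hat)  # approximately 5.627 + 1.590 * 12 = 24.71
```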
8.8 Applications of Least Squares
8.8.1 Data Analysis
Predict relationships between variables (e.g., sales vs. advertising spend).
8.8.2 Physics and Engineering
Fit theoretical models to experimental data.
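As a brief illustration (with made-up measurements, not real experimental data), least squares can fit the free-fall model $s = \tfrac{1}{2} g t^2$ to noisy distance measurements in order to estimate $g$; the model is linear in the single parameter $g$, so it reduces to a one-column least-squares problem:

```python
import numpy as np

# Hypothetical noisy measurements of distance fallen (m) at times t (s)
t = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
s = np.array([1.3, 4.8, 11.2, 19.5, 30.9])

# Model: s = 0.5 * g * t^2, linear in g, so the design matrix is one column
A = (0.5 * t**2).reshape(-1, 1)
g_est, *_ = np.linalg.lstsq(A, s, rcond=None)
print(g_est[0])  # estimated g, close to 9.8 for these made-up values
```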
8.8.3 Economics and Logistics
Optimize cost and demand models.
8.8.4 Image Processing
Reduce noise in images by fitting pixel values.
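One possible sketch of this idea (entirely synthetic image, illustrative only): fit a smooth plane to pixel values by least squares and use the fitted surface as a denoised estimate of a slowly varying intensity trend.

```python
import numpy as np

# Synthetic 32x32 image: a smooth intensity ramp plus random noise
rng = np.random.default_rng(1)
rows, cols = np.mgrid[0:32, 0:32]
image = 10 + 0.5 * rows + 0.2 * cols + rng.normal(0, 2, size=(32, 32))

# Fit a plane a + b*row + c*col to the pixel values by least squares
A = np.column_stack((np.ones(image.size), rows.ravel(), cols.ravel()))
coeffs, *_ = np.linalg.lstsq(A, image.ravel(), rcond=None)

# The fitted plane is a smoothed (denoised) estimate of the intensity trend
smoothed = (A @ coeffs).reshape(image.shape)
print(coeffs)                           # approximately [10, 0.5, 0.2]
print(np.abs(image - smoothed).mean())  # average magnitude of removed noise
```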