In forecasting, matrices are powerful tools used for organizing and analyzing data. They allow the representation of multiple relationships and variables compactly, making it easier to perform computations and apply statistical or machine learning techniques. Here’s an overview of how matrices are commonly applied in forecasting:
11.1 Linear Regression
Linear regression aims to model the relationship between input features and a target variable. In this explanation, we will explore how to express and solve linear regression problems using matrices.
11.1.1 General Form
In linear regression, the relationship between the input features and the target variable is assumed to be linear. The linear regression equation is:

$$\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$$

Where:

$\mathbf{y}$ is an $n \times 1$ vector of observed target values (response variable).

$X$ is an $n \times (p+1)$ design matrix (features matrix), where each row represents an observation and each column represents a feature.

$\boldsymbol{\beta}$ is a $(p+1) \times 1$ vector of coefficients (parameters).

$\boldsymbol{\varepsilon}$ is an $n \times 1$ vector of errors (residuals).
11.1.2 Matrix Representation
The matrix $X$ contains the input features. The first column of $X$ is filled with 1's to represent the intercept $\beta_0$. For example, for a dataset with three data points and two features:

$$X = \begin{bmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ 1 & x_{31} & x_{32} \end{bmatrix}$$

Where $x_{ij}$ is the input value of feature $j$ for observation $i$.
11.1.3 Vector of Coefficients
The vector $\boldsymbol{\beta}$ represents the coefficients (weights) of the model, including the intercept:

$$\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}$$

Where $\beta_0$ is the intercept, and $\beta_1, \beta_2$ are the coefficients for the input features.
11.1.4 Target Vector
The target vector $\mathbf{y}$ contains the observed values of the dependent variable:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$
11.1.5 Objective: Minimizing the Cost Function
To find the optimal coefficients $\boldsymbol{\beta}$, we minimize the error between the predicted values $X\boldsymbol{\beta}$ and the actual values $\mathbf{y}$. The error is measured using the sum of squared residuals (errors), called the cost function:

$$J(\boldsymbol{\beta}) = \frac{1}{2}(\mathbf{y} - X\boldsymbol{\beta})^{\top}(\mathbf{y} - X\boldsymbol{\beta})$$

Where:

$\mathbf{e} = \mathbf{y} - X\boldsymbol{\beta}$ is the residual vector.

The factor $\frac{1}{2}$ is included to simplify the differentiation step.
11.1.6 Minimizing the Cost Function
To minimize the cost function, we take the derivative with respect to $\boldsymbol{\beta}$, set it to zero, and solve for $\boldsymbol{\beta}$:

$$\frac{\partial J}{\partial \boldsymbol{\beta}} = -X^{\top}(\mathbf{y} - X\boldsymbol{\beta})$$

Set the derivative equal to zero:

$$-X^{\top}(\mathbf{y} - X\boldsymbol{\beta}) = 0$$

Simplifying:

$$X^{\top} X \boldsymbol{\beta} = X^{\top} \mathbf{y}$$

Now, solve for $\boldsymbol{\beta}$ by multiplying both sides by $(X^{\top} X)^{-1}$ (assuming $X^{\top} X$ is invertible):

$$\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}$$
This is the closed-form solution for linear regression, also known as the normal equation.
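As a sanity check (not part of the original derivation), the normal equation can be compared against NumPy's built-in least-squares solver on a small made-up dataset; both should recover the same coefficients:

```python
import numpy as np

# Hypothetical data: intercept column plus one feature, exactly y = 1 + 2x
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Normal equation: beta = (X^T X)^{-1} X^T y
beta_normal = np.linalg.inv(X.T @ X) @ (X.T @ y)

# NumPy's least-squares solver should agree
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_normal)   # close to [1. 2.]
print(beta_lstsq)
```

Since the data lie exactly on the line $y = 1 + 2x$, both methods return the intercept 1 and slope 2.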
11.1.7 Making Predictions
Once $\hat{\boldsymbol{\beta}}$ is computed, predictions can be made for the target variable using:

$$\hat{\mathbf{y}} = X\hat{\boldsymbol{\beta}}$$
11.1.8 Assumptions of Linear Regression
For the linear regression model to be meaningful, certain assumptions are typically made:
Linearity: The relationship between the input features and the target variable is linear.
Independence: The residuals (errors) are independent.
Homoscedasticity: The variance of residuals is constant across all observations.
Normality of Errors: The residuals follow a normal distribution (important for hypothesis testing and confidence intervals).
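The assumptions above can be probed empirically once a model is fitted. The following sketch uses synthetic data (not from the text) to illustrate two quick checks: with an intercept in the model, least-squares residuals average to (numerically) zero, and comparing residual spread across low and high fitted values gives a rough sense of homoscedasticity:

```python
import numpy as np

# Hypothetical example: fit a model, then inspect the residuals
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 100)   # errors i.i.d. normal by construction

beta = np.linalg.lstsq(X, y, rcond=None)[0]
fitted = X @ beta
residuals = y - fitted

# With an intercept column, least-squares residuals sum to (numerically) zero
print("mean residual:", residuals.mean())

# Rough homoscedasticity check: residual spread in low vs. high fitted values
mask = fitted < np.median(fitted)
print("std (low fitted): ", residuals[mask].std())
print("std (high fitted):", residuals[~mask].std())
```

For formal diagnostics (e.g. a Shapiro-Wilk test for normality), dedicated statistics libraries would be used instead of these manual checks.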
11.2 Example in Python
Here’s an example in Python of computing $\hat{\boldsymbol{\beta}}$ using the closed-form solution and making predictions:
```python
import numpy as np

# Sample data (X and y)
X = np.array([[1, 1, 4],   # Design matrix (including intercept column of 1's)
              [1, 2, 5],
              [1, 3, 6]])
y = np.array([5, 7, 9])    # Actual target values

# Compute the coefficients using the normal equation
X_transpose = X.T                    # Transpose of X
X_transpose_X = X_transpose.dot(X)   # X^T X
X_transpose_y = X_transpose.dot(y)   # X^T y

# Use the pseudo-inverse in case X^T X is singular
beta = np.linalg.pinv(X_transpose_X).dot(X_transpose_y)

# Display the coefficients (beta values)
print("Coefficients (beta):", beta)

# Make predictions
y_hat = X.dot(beta)                  # Predicted target values
print("Predicted values (y_hat):", y_hat)
```
11.3 Markov Chains
A Markov Chain is a mathematical model that describes a system undergoing transitions from one state to another, where the probability of moving to the next state depends only on the current state (not past states). This property is called the Markov property.
In the context of Linear Algebra, Markov Chains can be analyzed using matrices, particularly the transition matrix, to understand how the system evolves over time.
11.3.1 State Vectors
In a Markov Chain, the system’s state at any given time is represented by a state vector. This vector consists of probabilities of being in each possible state.
For example, if a system has two weather states, rain (H) and clear (C), the state vector could be:

$$\mathbf{x} = \begin{bmatrix} P(H) & P(C) \end{bmatrix}$$

Where $P(H)$ is the probability of the system being in state H, and $P(C)$ is the probability of the system being in state C.
11.3.2 Transition Matrix
The transition matrix $P$ describes the probabilities of transitioning between states in the system. It is a square matrix where the element $P_{ij}$ represents the probability of transitioning from state $i$ to state $j$.
For a two-state system with states H (rain) and C (clear), the transition matrix might look like:

$$P = \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}$$
In this example:
The probability of staying in state H (rain to rain) is 0.7.
The probability of transitioning from state H (rain) to state C (clear) is 0.3.
The probability of transitioning from state C to state H is 0.4.
The probability of staying in state C (clear to clear) is 0.6.
11.3.3 Matrix Multiplication
To compute the state of the system at the next time step, you multiply the current state vector by the transition matrix.
If the current state vector is $\mathbf{x}_k$, the state vector at the next time step, $\mathbf{x}_{k+1}$, is given by:

$$\mathbf{x}_{k+1} = \mathbf{x}_k P$$

For example, if the system starts in state H with certainty, so $\mathbf{x}_0 = \begin{bmatrix} 1 & 0 \end{bmatrix}$, then:

$$\mathbf{x}_1 = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}$$

This results in:

$$\mathbf{x}_1 = \begin{bmatrix} 0.7 & 0.3 \end{bmatrix}$$
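The update above is a single vector-matrix product, so it is straightforward to carry out in NumPy. This sketch uses the transition matrix from the text and an illustrative starting distribution (starting in state H with certainty):

```python
import numpy as np

# Transition matrix from the text (each row sums to 1)
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
assert np.allclose(P.sum(axis=1), 1.0)

# Illustrative choice: start in state H with certainty
x0 = np.array([1.0, 0.0])

# One step: x1 = x0 P
x1 = x0 @ P
print(x1)   # [0.7 0.3]

# Probabilities remain a valid distribution after each step
x = x0
for _ in range(5):
    x = x @ P
print(x)
```

Note that each multiplication preserves the property that the entries sum to 1, since the rows of $P$ each sum to 1.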
11.3.4 Steady State
A crucial concept in Markov Chains is the steady state or stationary distribution, where the system reaches a point where the state probabilities no longer change over time.
Mathematically, the steady state vector $\boldsymbol{\pi}$ satisfies the equation:

$$\boldsymbol{\pi} P = \boldsymbol{\pi}$$

To find the steady state, you solve for the (left) eigenvector of the transition matrix corresponding to eigenvalue $\lambda = 1$, normalized so its entries sum to 1. The steady-state vector is the distribution where the system remains unchanged after one application of the transition matrix.
11.3.5 Eigenvectors and Eigenvalues
The steady state of a Markov Chain can be determined by finding the eigenvector corresponding to the eigenvalue 1 of the transition matrix $P$, since at steady state the state vector doesn’t change when multiplied by the transition matrix.
To summarize, in a Markov Chain:
The transition matrix describes the system’s transition probabilities.
The state vector updates over time by multiplying it by the transition matrix.
The steady state vector is the eigenvector associated with eigenvalue 1, representing the system’s long-term probabilities of being in each state.
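One simple way to see the long-term behavior numerically is power iteration: repeatedly apply the transition matrix until the state vector stops changing. A minimal sketch, using the two-state matrix from the text and an arbitrary starting distribution:

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# Repeatedly apply the transition matrix until the state stops changing
x = np.array([0.5, 0.5])   # any valid starting distribution works for this chain
for _ in range(1000):
    x_next = x @ P
    if np.allclose(x_next, x, atol=1e-12):
        break
    x = x_next

print(x)   # approaches [4/7, 3/7] ≈ [0.5714, 0.4286]
```

Convergence is fast here because the second eigenvalue of $P$ is 0.3, so the distance to the steady state shrinks by a factor of 0.3 per step.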
11.3.6 Example Problem: Finding Steady State
Let’s take the transition matrix from before:

$$P = \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}$$

To find the steady-state vector, solve:

$$\boldsymbol{\pi} P = \boldsymbol{\pi}, \qquad \pi_1 + \pi_2 = 1$$

which translates to solving the system of equations to find the vector $\boldsymbol{\pi}$ that does not change after multiplication with the matrix $P$. Writing out the first equation, $0.7\pi_1 + 0.4\pi_2 = \pi_1$, gives $0.3\pi_1 = 0.4\pi_2$; combined with $\pi_1 + \pi_2 = 1$, this yields $\pi_1 = 4/7 \approx 0.571$ and $\pi_2 = 3/7 \approx 0.429$.
Markov Chains in Linear Algebra make use of key concepts such as matrices, vectors, and eigenvalues to model systems that evolve probabilistically. By applying matrix operations and finding eigenvectors corresponding to eigenvalue 1, we can describe long-term behavior and steady states in such systems.
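The eigenvector approach described above can be carried out directly with NumPy. The steady state is the left eigenvector of $P$ with eigenvalue 1, which is the same as a right eigenvector of $P^{\top}$; it is then normalized so its entries sum to 1:

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# The steady state is the left eigenvector of P for eigenvalue 1,
# i.e. a (right) eigenvector of P^T, normalized to sum to 1
eigvals, eigvecs = np.linalg.eig(P.T)
idx = np.argmin(np.abs(eigvals - 1.0))   # pick the eigenvalue closest to 1
pi = np.real(eigvecs[:, idx])
pi = pi / pi.sum()                       # normalize to a probability distribution

print(pi)       # ≈ [0.5714, 0.4286], i.e. [4/7, 3/7]
print(pi @ P)   # unchanged: pi is stationary
```

Dividing by the sum also fixes the sign ambiguity of the eigenvector returned by `np.linalg.eig`, so `pi` always comes out as a valid probability distribution.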