Section 10 Factor Analysis

Factor Analysis can be viewed as an extension of Multivariate Regression (determining a mean effect) and Principal Component Analysis (dimension reduction). However, it seeks to answer a different question: whether the data are consistent with groups of highly correlated variables being modelled together through the introduction of new “common factor” variables (Johnson, Wichern, and others (2014), Section 9.1, page 481).

Below I have reproduced some of the key results for the Orthogonal Factor Model. In keeping with Johnson, Wichern, and others (2014), I do not consider Factor Models with Oblique/non-orthogonal factors.

Analysis

The orthogonal factor model has a similar form to a Multivariate Linear Regression Model. The matrix of loadings \(L\) performs the role of the model parameters \(\beta\), the vector \(F\) performs the role of the independent/explanatory variables and \(\mu\) the role of the intercept term in the design matrix.

The key difference is that the “explanatory variables” or “factor scores” \(F\) are not directly observable and are treated as a random vector. This change results in additional assumptions being made: namely, that the components of the factor scores have zero expectation and are uncorrelated both with the error terms and with each other.

(see Johnson, Wichern, and others (2014), Section 9.2, page 483)

Definition 10.1 (Orthogonal Factor Model with m Common Factors) \[\underset{(p \times 1)}{X}=\underset{(p \times 1)}{\mu}+\underset{(p \times m)}{L}\underset{(m \times 1)}{F}+\underset{(p \times 1)}{\epsilon}\]

where \(\mu_{i}\) is the mean of the ith variable, \(\epsilon_{i}\) the ith specific factor, \(F_{j}\) the jth common factor, and \(l_{ij}\) the loading of the ith variable on the jth factor.

The unobservable random vectors \(F\) and \(\epsilon\) are assumed to satisfy the following conditions:

  • \(F\) and \(\epsilon\) are independent
  • \(E(F)=0\)
  • \(Cov(F)=I\)
  • \(E(\epsilon)=0\)
  • \(Cov(\epsilon)=\Psi\)

where \(\Psi\) is a diagonal matrix

\(\square\)
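
As a concrete illustration of Definition 10.1, the following minimal sketch simulates data from an orthogonal factor model and checks the assumptions on the unobservable quantities. It assumes NumPy; the dimensions, loadings, and specific variances are illustrative values chosen here for demonstration, not taken from Johnson, Wichern, and others (2014).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and parameters (chosen for demonstration only)
p, m, n = 4, 2, 100_000
mu = np.array([1.0, 2.0, 3.0, 4.0])
L = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.7],
              [0.2, 0.6]])                  # p x m loading matrix
Psi = np.diag([0.3, 0.4, 0.5, 0.2])         # diagonal specific-variance matrix

# Unobservable quantities satisfying the model assumptions:
# E(F) = 0, Cov(F) = I, E(eps) = 0, Cov(eps) = Psi, F independent of eps
F = rng.standard_normal((n, m))
eps = rng.multivariate_normal(np.zeros(p), Psi, size=n)

X = mu + F @ L.T + eps                      # X = mu + L F + eps (one row per observation)

print(F.mean(axis=0).round(2))              # approximately 0
print(np.cov(F, rowvar=False).round(2))     # approximately the identity
print(np.cov(eps, rowvar=False).round(2))   # approximately Psi
```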

Similar to Principal Component Analysis, in factor analysis we choose the number of common factors \(m\) to be less than the dimension \(p\) of our observable random vector. In doing so, we attempt to approximate the \(p(p+1)/2\) components of the covariance matrix with only the \(pm\) components of \(L\) plus the \(p\) diagonal components of \(\Psi\) (for example, with \(p=12\) and \(m=2\), the \(78\) variances and covariances are summarised by \(24+12=36\) parameters). The covariance structure in Proposition 10.1 follows exactly from the Orthogonal Factor Model; an arbitrary covariance matrix, however, is only guaranteed to admit such a factorization when \(m=p\):

(see Johnson, Wichern, and others (2014), Section 9.2, page 484)

Proposition 10.1 (Covariance Structure for the Orthogonal Factor Model) \[\begin{align} Var(X_{i})&=l_{i1}^2+\cdots+l_{im}^2+\psi_{i}\\ Cov(X_{i},X_{k})&=l_{i1}l_{k1}+\cdots+l_{im}l_{km}\\ Cov(X_{i},F_{j})&=l_{ij} \end{align}\] \(\square\)
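
A quick numerical check of Proposition 10.1, reusing the illustrative \(L\) and \(\Psi\) from the sketch above (again, values chosen purely for demonstration): the implied covariance matrix \(LL^\prime+\Psi\) reproduces the stated variances and covariances element by element.

```python
import numpy as np

L = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.7],
              [0.2, 0.6]])
psi = np.array([0.3, 0.4, 0.5, 0.2])

Sigma = L @ L.T + np.diag(psi)            # covariance implied by the factor model

i, k = 0, 2
var_i = np.sum(L[i] ** 2) + psi[i]        # l_i1^2 + ... + l_im^2 + psi_i
cov_ik = np.sum(L[i] * L[k])              # l_i1 l_k1 + ... + l_im l_km

assert np.isclose(Sigma[i, i], var_i)
assert np.isclose(Sigma[i, k], cov_ik)
# Cov(X, F) = L Cov(F) = L, so Cov(X_i, F_j) = l_ij
```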

It turns out that Factor Loadings are only unique up to an orthogonal rotation. The purpose of such a rotation is often to improve the interpretation of the Factors:

(see Johnson, Wichern, and others (2014), Section 9.2, page 487)

Proposition 10.2 (Uniqueness of Factor Loadings) Factor loadings \(L\) are determined only up to an orthogonal matrix \(T\). Thus the loadings \(L^*=LT\) and \(L\) both give the same representation. Furthermore the communalities, given by the diagonal elements of \(LL^\prime=(L^*)(L^*)^\prime\), are also unaffected by the choice of \(T\): \[\Sigma=LL^\prime+\Psi=(LT)(LT)^\prime+\Psi=(L^*)(L^*)^\prime+\Psi\] \(\square\)
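
The following sketch illustrates Proposition 10.2 with the same illustrative loadings and an arbitrary \(2 \times 2\) rotation matrix \(T\) (both chosen here for demonstration): rotating the loadings changes \(L\) but leaves \(LL^\prime\), and hence the communalities, unchanged.

```python
import numpy as np

L = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.7],
              [0.2, 0.6]])

theta = 0.7                                        # arbitrary rotation angle
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # orthogonal: T T' = I

L_star = L @ T                                     # rotated loadings L* = L T

assert np.allclose(L @ L.T, L_star @ L_star.T)                       # same LL'
assert np.allclose((L ** 2).sum(axis=1), (L_star ** 2).sum(axis=1))  # same communalities
```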

Estimation

To motivate the Principal Component Solution to the Factor Model in Proposition 10.3, we note that the spectral decomposition result in Proposition 8.1 allows a decomposition of the covariance matrix which fits the factor analysis model, namely by setting \(L:=[\sqrt{\lambda_{1}}e_{1}|\sqrt{\lambda_{2}}e_{2}| \cdots |\sqrt{\lambda_{p}}e_{p}]\):

\[ \underset{(p \times p)}{\Sigma}=\lambda_{1}e_{1}e_{1}^\prime + \lambda_{2}e_{2}e_{2}^\prime+\cdots+\lambda_{p}e_{p}e_{p}^\prime=[\sqrt{\lambda_{1}}e_{1}|\sqrt{\lambda_{2}}e_{2}| \cdots |\sqrt{\lambda_{p}}e_{p}] \begin{bmatrix}\sqrt{\lambda_{1}}e_{1}^\prime\\ \sqrt{\lambda_{2}}e_{2}^\prime\\ \vdots\\ \sqrt{\lambda_{p}}e_{p}^\prime \end{bmatrix} =\underset{(p \times p)}{L}\underset{(p \times p)}{L^\prime} + \underset{(p \times p)}{0} \]
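
A numerical sketch of this decomposition (the \(3 \times 3\) covariance matrix below is an illustrative choice): taking all \(p\) eigenvalue-eigenvector pairs reproduces \(\Sigma\) exactly as \(LL^\prime\) with zero specific variance.

```python
import numpy as np

Sigma = np.array([[1.00, 0.63, 0.45],
                  [0.63, 1.00, 0.35],
                  [0.45, 0.35, 1.00]])    # illustrative covariance matrix

lam, E = np.linalg.eigh(Sigma)            # eigenvalues ascending; columns of E are eigenvectors
lam, E = lam[::-1], E[:, ::-1]            # reorder so that lambda_1 >= ... >= lambda_p

L = E * np.sqrt(lam)                      # column j is sqrt(lambda_j) * e_j

assert np.allclose(Sigma, L @ L.T)        # Sigma = L L' exactly when m = p
```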

The Principal Component Solution to the Orthogonal Factor Model is given below (see Johnson, Wichern, and others (2014), Section 9.3, page 490)

Proposition 10.3 (Principal Component Solution to the Factor Model) The principal component factor analysis of the sample covariance matrix \(S\) is specified in terms of the eigenvalue-eigenvector pairs \((\widehat{\lambda_{1}},\widehat{e_{1}})\), \((\widehat{\lambda_{2}},\widehat{e_{2}})\), …, \((\widehat{\lambda_{p}},\widehat{e_{p}})\) where \(\widehat{\lambda_{1}} \geq \widehat{\lambda_{2}} \geq \cdots \geq \widehat{\lambda_{p}}\). Let \(m < p\) be the number of common factors; then the matrix of common factor loadings \((\widetilde{l}_{ij})\) is given by: \[\widetilde{L}=[\sqrt{\widehat{\lambda_{1}}}\widehat{e_{1}} | \sqrt{\widehat{\lambda_{2}}}\widehat{e_{2}}|\cdots|\sqrt{\widehat{\lambda_{m}}}\widehat{e_{m}}]\]

The estimated specific variances are provided by the diagonal elements of the matrix \(S-\widetilde{L}\widetilde{L}^\prime\), so:

\[\widetilde{\Psi}=\begin{bmatrix}\widetilde{\psi}_{1} & 0 & \cdots & 0\\0 & \widetilde{\psi}_{2} & \cdots & 0\\\vdots &\vdots&\ddots&\vdots\\0 & 0 &\cdots&\widetilde{\psi}_{p}\end{bmatrix}\]

with \(\widetilde{\psi}_{i}=s_{ii}-\sum_{j=1}^{m}\widetilde{l}_{ij}^2\).

Communalities are estimated as: \[\widetilde{h}_{i}^2=\widetilde{l}_{i1}^2+\widetilde{l}_{i2}^2+\cdots+\widetilde{l}_{im}^2\]
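
A minimal sketch of Proposition 10.3 (assuming NumPy; the sample covariance matrix \(S\) and the choice \(m=1\) are illustrative): the loadings are built from the first \(m\) eigenpairs of \(S\), the specific variances from the diagonal of \(S-\widetilde{L}\widetilde{L}^\prime\), and the communalities as the row sums of squared loadings.

```python
import numpy as np

S = np.array([[1.00, 0.63, 0.45],
              [0.63, 1.00, 0.35],
              [0.45, 0.35, 1.00]])         # illustrative sample covariance matrix
m = 1                                      # chosen number of common factors

lam, E = np.linalg.eigh(S)
lam, E = lam[::-1], E[:, ::-1]             # sort eigenpairs so lambda_1 >= ... >= lambda_p

L_tilde = E[:, :m] * np.sqrt(lam[:m])      # loadings: sqrt(lambda_j) * e_j, j = 1..m
psi_tilde = np.diag(S) - (L_tilde ** 2).sum(axis=1)   # specific variances: s_ii - sum_j l_ij^2
h2_tilde = (L_tilde ** 2).sum(axis=1)      # communalities

print("loadings:\n", L_tilde.round(3))
print("specific variances:", psi_tilde.round(3))
print("communalities:", h2_tilde.round(3))
# Residual of the approximation S ~ L L' + Psi
print("residual:\n", (S - (L_tilde @ L_tilde.T + np.diag(psi_tilde))).round(3))
```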

References

Johnson, Richard Arnold, Dean W. Wichern, and others. 2014. Applied Multivariate Statistical Analysis. Vol. 4. Prentice-Hall, New Jersey.