# Chapter 7 Multiple Factor Analysis

Data Table MFA is used to handle multiple tables, that have different variables measuring the same observations, or same set of variables on different set of observations.

Goal:

1. To analyze several variables/ data tables measured on the same observations

2. To provide a set of common factor scores (compromise factor scores)

3. To project each of the original data sets onto the compromise to analyze commonalities and discrepancies.

Note: The number of the variables used to describe the observations can vary from one data table to anoother, but the observations should be the same in all the data tables.

Key ideas

• Normalization: The general idea behind MFA is to normalize each of the individual data sets so that their first principal component has the same length (as measured by the first singular value of each data table).

NOTE This step is performed to make sure the analysis is not driven by a single table for example in case of unidimensional data.

• Combining the tables: data tables into a common representation of the observations sometimes called the compromise. This compromise is obtained from a (non-normalized) PCA of the grand table obtained from the concatenation of the normalized data tables.

Interpretation

``````1. The second PCA on the normalized concatinated data decomposes the data to
give a set of new orthoginal variables.

2. Global Factor Scores
The coordinates of the observations on the components are called factor
scores and these can be used to plot maps of the observations in which
the observations are represented as points such that the distances in the
map best reflect the similarities between the observations.

3. Partial Factor Scores
The positions of the observations ‘as seen by’ each data set are called
partial factor scores and can be also represented as points in the
compromise map. The average of the factor scores of all the tables give
back the factor score of the compromise.   ``````

## 7.1 Datasets

Survey of Autobiographical Memory(SAM) + Object-Spatial Imagery Questionnaire(OSIQ) + Big Five Inventory(BFI)

Rows: The observation is common across the two tables. 144 Participants asked to rate the extent to which a particular item applied to their memory in general, using a 5-point Likert scale.

SAM Columns: 26 quantitative variables

``````Episodic Memory: 8 questions (Memory for personal experiences)
Semantic Memory : 6 questions (Memory for general knowledge and facts)
Spatial Memory : 6 questions (Memory testing how well one can navigate)
Prospective Memory : 6 questions (How clearly one can picture future events)``````

OSIQ Columns: 30 quantitative variables.

``````Object imagery scale : 15 questions(processing of colorful, pictorial, and
high-resolution images of individual objects)

Spatial imagery scale : 15 questions (processing of schematic images, spatial
relations amongst objects, and spatial transformations)``````

BFI Columns : 44 quantitative variables

Extraversion : 8 questions Agreeable: 9 questions Conscientiousness: 9 questions Neuroticism: 8 questions Openness: 10 questions

..

``````# colour for rows
col4row <- data1_active\$memoryGroups
col4row <- recode(col4row,
Norm = 'orange2',
High = 'darkred'
)

#color for colums

E <- c("#5f9ea0","#5f9ea0","#5f9ea0","#5f9ea0","#5f9ea0","#5f9ea0","#5f9ea0","#5f9ea0")
S<- c("#305ABF","#305ABF","#305ABF","#305ABF","#305ABF","#305ABF")
P <- c("#76ee00","#76ee00","#76ee00","#76ee00","#76ee00","#76ee00")
Fu <- c("#ff7f24", "#ff7f24", "#ff7f24", "#ff7f24", "#ff7f24", "#ff7f24")
col4Var1 <- as.matrix(c(E,S,P,Fu))

object <- 15
ob <-rep("gold", each = object)

spatial<- 15
sp <-rep("olivedrab4", each = spatial)

col4Var2 <- as.matrix(c(ob,sp))

Extrovert<- 8
Ex<- rep("royalblue2", each = Extrovert)

Agreeableness <- 9
Ag<-rep("indianred1", each = Agreeableness )

Conscientiousness <- 9
Co<-rep("violetred4", each = Conscientiousness)

Neuroticism <- 8
Ne<- rep("#b68365", each =Neuroticism)

Openness <- 10
Op<- rep("mediumpurple3" , each =Openness)

col4Var3<- as.matrix(c(Ex,Ag,Co,Ne,Op))

col4combined<- as.matrix(c(col4Var1,col4Var2,col4Var3))``````
``````res.mfa <- mpMFA(data_MFA,
column.design = t(col.design),
make.columndesign.nominal = TRUE,
DESIGN = row.design,
make.design.nominal = TRUE,
graphs = FALSE )``````
``````## [1] "Preprocessed the Rows of the data matrix using:  None"
## [1] "Preprocessed the Columns of the data matrix using:  Center_1Norm"
## [1] "Preprocessed the Tables of the data matrix using:  MFA_Normalization"
## [1] "Preprocessing Completed"
## [1] "Optimizing using:  None"
## [1] "Processing Complete"``````

## 7.2 Looking at the data Pattern

### 7.2.1 RV heat map:

RV matrix tells us the correlation between the three data tables. The RV coefficient varies between 0 and 1 and reflects the amount of variance shared the tables. SAM and OSIQ are more similar with higher correlation than with BFI.

``````rv<- res.mfa\$mexPosition.Data\$InnerProduct\$RVMatrix
rownames(rv)<-c("SAM","OSIQ","BFI")
colnames(rv)<-rownames(rv)

corrplot(rv,  method = "color", tl.cex = 2, tl.col = c("pink3","seagreen","mediumorchid4"),
addCoef.col = "white", number.digits = 2, number.cex = 1,
cl.pos = 'b', cl.cex = .5,
col = colorRampPalette(c("darkred", "white","midnightblue"))(20)) ``````

#### 7.2.1.1 Correlation Plot

SAM and OSIQ show stronger correlation with each other than BFI. The Blue blocks along the diagonal shows how the questions are correlated with each other. The black lines separate the tables.

• Episodic and Future memory questions(SAM) have stronger positive correlation with object imagery questions(OSIQ) and also show positive correlation with openness , extroversion and agreeableness questions(BFI)

• Neurotism(BFI) seems strongly negatively correlated with extroversion (BFI), however it is also slightly negatively correlated with all other set of questions.

``````XYZ.cor <- cor(data_MFA)

corrplot(XYZ.cor, method = "color",tl.col = col4combined,tl.cex = .3, cl.pos ='b',
col = colorRampPalette(c("darkred", "white","midnightblue"))(20)
)

lines(c(26.5, 26.5), c(0,100 ), lwd=1, lty=1)
lines(c(56.5, 56.5), c(0,100), lwd=1, lty=1)
lines(c(0,100),c(44.5, 44.5), lwd=1, lty=1)
lines(c(0,100),c(74.5, 74.5), lwd=1, lty=1)``````

## 7.3 MFA analysis

NOTE:

MFA Weights applied to each table = 1/( first singular value after performing PCA on that table). Weight is the scaler value applied such that all the three tables equally contribute towards the dimension 1.

``````Eig.tab <- res.mfa\$mexPosition.Data\$Compromise\$compromise.eigs
Alpha <- 1/sqrt(Eig.tab)

weight <- Alpha

plot.weights <-barplot(weight, main= "Weights",
col = as.vector(c("pink3","seagreen","mediumorchid4") ),
xlab= " data tables")``````

### 7.3.1 Scree Plot

Note: Dimension 3 and 4 form an eigen plane so it is better to look at these dimensions together.

There are no significaant dimensions in the plot because there is no permutation test been developed yet. Inference battery is not ready for MFA .

``````PlotScree(ev =res.mfa\$mexPosition.Data\$Table\$eigs,
title = "MFA Scree Plot",
plotKaiser = TRUE,
)``````

### 7.3.2 Row Factor scores:

#### 7.3.2.1 Global Factor Scores

Dimension 1 represents the seperation of the High Autobiographical memory group from the low which transitions from one group to another along the axis. The means are significantly separate from each other.

``````#Global row factor scores
Fi <-res.mfa\$mexPosition.Data\$Table\$fi

col4means<- gplots::col2hex(c( 'darkred', 'orange3'))

# Labels for Inertia
label4Map.mfa <- createxyLabels.gen(1,2,
lambda = res.mfa\$mexPosition.Data\$Table\$eigs,
tau = res.mfa\$mexPosition.Data\$Table\$t)

#MFA I-set map Dimension(1 and 2)
baseMap.i <- PTCA4CATA::createFactorMap( Fi,
title = "MFA global Factor Scores",
col.points = col4row,
display.labels = FALSE,
alpha.points = .4
)

aggMap.i <- baseMap.i\$zeMap_background + baseMap.i\$zeMap_dots + label4Map.mfa

aggMap.i``````

##### 7.3.2.1.1 With Means

There is a separation between the means of the group. A gradation of means can be observed from High to Normal across dimension 1.

``````grp.ind <- order(data1_active\$memoryGroups)[!duplicated(sort(data1_active\$memoryGroups))]
grp.col <- col4row
grp.name <- data1_active\$memoryGroups[grp.ind] # get the corresponding groups
names(grp.col) <- grp.name

group.mean <- aggregate(Fi,
by = list(data1_active\$memoryGroups),
# must be a list
mean
)

rownames(group.mean) <- group.mean[,1] # Use the first column as row names
fi.mean <- group.mean[,-1] # Exclude the first column

fi.mean.plot <- createFactorMap(fi.mean,
alpha.points = 1,
display.labels = TRUE,
col.points = col4means,
col.labels = col4means,
pch = 17,
cex = 3,
text.cex = 4
)

aggMap.i.withMeans <- aggMap.i + label4Map.mfa +fi.mean.plot\$zeMap_dots + fi.mean.plot\$zeMap_text``````
##### 7.3.2.1.2 With Confidence interval

The confidence interval are concise and segregated depicting that the gorup means are significiantly different.

``````## MFA Confidence Intervals - Bootstrap for CI

BootCube <- PTCA4CATA::Boot4Mean(res.mfa\$mexPosition.Data\$Table\$fi,
design = row.design,
niter = 100,
suppressProgressBar = TRUE)

# Create the ellipses using function MakeCIEllipses from package PTCA4CATA
GraphElli <- PTCA4CATA::MakeCIEllipses(BootCube\$BootCube[,1:2,],
names.of.factors = c("Dimension 1","Dimension 2"),
col = col4means,
p.level = .95
)

# create the I-map with Observations, means and confidence intervals
Map.I.withCI <-  aggMap.i.withMeans +  GraphElli + label4Map.mfa + ggtitle(" Global Row Factor Scores with Confidence Intervals")

Map.I.withCI``````

``````Map.I.withCI <- recordPlot()

aggMap.i.withMeans``````

``aggMap.i.withMeans<- recordPlot()``

#### 7.3.2.2 Partial factor scores

With Means

SAM is least distant from the mean than BFI and OSIQ

``````F_j<- res.mfa\$mexPosition.Data\$Table\$partial.fi.array

res.mfa\$mexPosition.Data\$InnerProduct\$alphaWeights``````
``````##           [,1]      [,2]      [,3]
## [1,] 0.3333333 0.3333333 0.3333333``````
``````Eig.tab <- res.mfa\$mexPosition.Data\$Compromise\$compromise.eigs
alpha_j <- 1/sqrt(Eig.tab)

data_tables<- col.design
code4Groups<- unique(data_tables)
nK<- length(code4Groups)

F_k <- array(0, dim = c(dim(F_j)[[1]], dim(F_j)[[2]],nK))
dimnames(F_k) <- list(dimnames(F_j)[[1]], dimnames(F_j)[[2]], code4Groups)

alpha_k <- rep(0, nK)
names(alpha_k) <- code4Groups
Fa_j <- F_j

# A horrible loop
for (j in 1:dim(F_j)[[3]]){ Fa_j[,,j] <- F_j[,,j] * alpha_j[j] }

# Another horrible loop
for (k in 1:nK){
# lindex <- data_tables == code4Groups[k]
alpha_k[k] <- alpha_j[k]
F_k[,,k] <- (1/alpha_k[k])*apply(Fa_j[,,k],c(1,2),sum)

}

meanfk <-
apply(F_k, c(2,3), FUN = function(x){
aggregate(x, by = list(row.design), mean)\$x
})
dim(meanfk)  ``````
``## [1]   2 100   3``
`````` mean.plot <- createFactorMap(fi.mean,
constraints = minmaxHelper4Partial(fi.mean, meanfk, axis1 = 1 ,axis2 = 2) ,
alpha.points = 1,
display.labels = TRUE,
col.points = col4means,
col.labels = col4means,
pch = 17,
cex = 3,
text.cex = 4
)

Fi.meanonly.plot<- mean.plot\$zeMap_background+mean.plot\$zeMap_dots + mean.plot\$zeMap_text+ label4Map.mfa

Fi.meanonly.plot``````

``````pf.means <- createPartialFactorScoresMap(
factorScores = fi.mean,
partialFactorScores = meanfk,
axis1 = 1, axis2 = 2,
colors4Items = as.vector(col4means),
colors4Blocks = c("green3", "maroon4","midnightblue"),
names4Partial = c("SAM", "OSIQ","BFI"), #
font.labels = 'bold',
size.labels = 5,
)

plot.pFi.mean <- Fi.meanonly.plot + label4Map.mfa + pf.means\$mapColByItems
plot.pFi.block <- Fi.meanonly.plot + label4Map.mfa +pf.means\$mapColByBlocks
plot.pFi.mean``````

``plot.pFi.block``

### 7.3.3 Column Factor scores

These interpretations are made with the help of the following contributuion barplots

Dim1: Better the memory for personal experiences and future memory , more openness and better object imagery

Dim2: Better the spatial memory, better is their Spatial imagery more the Extroversion and contrasts with Neuroticism

Dim 3&4: Better the Spatial Memory, Better is the spatial imagery more likely to be extroverted and show openness to new situations whereas these participants are likely to have poorer Episodic Memory and so is their Conscientiousness, Agreeableness.

``````Q <- res.mfa\$mexPosition.Data\$Table\$Q

label4Map.mfa <- createxyLabels.gen(1,2,
lambda = res.mfa\$mexPosition.Data\$Table\$eigs,
tau = res.mfa\$mexPosition.Data\$Table\$t)

baseMap.j <- createFactorMap(Q, #constraints = constraints.sym,
col.points = col4combined,
col.labels = col4combined,
display.labels = TRUE,
display.points = TRUE,
text.cex = 3,
force=2,
cex =2,

#lines4J <- addLines4MCA(Fj, col4Var = col4Levels.imp\$color4Variables, size = 1)

``````Loadings_12<- recordPlot()

label4Map.mfa2 <- createxyLabels.gen(3,4,
lambda = res.mfa\$mexPosition.Data\$Table\$eigs,
tau = res.mfa\$mexPosition.Data\$Table\$t)

baseMap.j.2 <- PTCA4CATA::createFactorMap(Q, axis1 = 3, axis2 = 4,
col.points   =  gplots::col2hex(col4combined),
alpha.points =  .8,
alpha.labels = .8,
col.labels   =  gplots::col2hex(col4combined),
force = 5,
cex = 2,
text.cex = 3,

# A graph for the J-set 3 and 4

### 7.3.4 Column Contribution

``````Fj<- res.mfa\$mexPosition.Data\$Table\$Q
ctrJ<- res.mfa\$mexPosition.Data\$Table\$cj
signed.ctrJ <- ctrJ * sign(Fj)

plotctrJ.1 <- PrettyBarPlot2(
bootratio = round(100*signed.ctrJ[,1]),
threshold = 100 / nrow(signed.ctrJ),
ylim = NULL,
color4bar = gplots::col2hex(col4combined),
color4ns = "gray75",
plotnames = TRUE,
main = 'Column Contributions Dim 1.',
ylab = "Signed Contributions",
signifOnly = TRUE,
font.size = 5)

plotctrJ.1``````

``````plotctrJ.2 <- PrettyBarPlot2(
bootratio = round(100*signed.ctrJ[,2]),
threshold = 100 / nrow(signed.ctrJ),
ylim = NULL,
color4bar = gplots::col2hex(col4combined),
color4ns = "gray75",
plotnames = TRUE,
main = 'Column Contributions Dim 2.',
ylab = "Signed Contributions",
signifOnly = TRUE,
font.size = 5)

plotctrJ.2 ``````

``````plotctrJ.3 <- PrettyBarPlot2(
bootratio = round(100*signed.ctrJ[,3]),
threshold = 100 / nrow(signed.ctrJ),
ylim = NULL,
color4bar = gplots::col2hex(col4combined),
color4ns = "gray75",
plotnames = TRUE,
main = 'Column Contributions Dim 3.',
ylab = "Signed Contributions",
signifOnly = TRUE,
font.size = 5)

plotctrJ.3``````

``````plotctrJ.4 <- PrettyBarPlot2(
bootratio = round(100*signed.ctrJ[,4]),
threshold = 100 / nrow(signed.ctrJ),
ylim = NULL,
color4bar = gplots::col2hex(col4combined),
color4ns = "gray75",
plotnames = TRUE,
main = 'Column Contributions Dim 4.',
ylab = "Signed Contributions",
signifOnly = TRUE,
font.size = 5)

plotctrJ.4``````

## 7.4 Conclusion:

Dim1 Participants under the High AM group are likely to have better memory for personal experiences and future memory, as well as they tend to show more openness and have a preference towards object imagery.

Dim2 Better the spatial memory, better is their Spatial imagery more the Extroversion and contrasts with Neuroticism.

Dim 3&4 Better the Spatial Memory, Better is the spatial imagery more likely to be extroverted and show openness to new situations whereas these participants are likely to have poorer Episodic Memory and so is their Conscientiousness, Agreeableness.