## 14.3 Confirmatory Factor Analysis

**Confirmatory Factor Analysis** (**CFA**) is a popular SEM method in which one specifies how observed variables relate to assumed latent variables (Thompson 2004). CFA is often used to evaluate the psychometric properties of questionnaires or other assessments. It allows researchers to determine if the variables they assess indeed measure one or more latent variables, and how these latent variables relate to each other. For frequently used questionnaires, there are often many studies which report the correlations between the different items of the questionnaires. Such data can be used for meta-analytic SEM, which allows us to **evaluate which latent variable (or factor) structure** is the **most appropriate** based on all available evidence.

In this example, let us assume that we want to confirm the latent factor structure of a **questionnaire for sleep problems**. The questionnaire is assumed to measure two distinct latent variables, which together characterize sleep problems: **insomnia** and **lassitude** (in fact, Koffel and Watson (2009) argue that sleep complaints do indeed follow this structure). For our questionnaire, we have created a fictitious dataset, `dat.cfa` (based on the `Digman97` data in the `metaSEM` package), which contains 14 independent studies reporting the correlations between the different symptoms of sleep complaints that our questionnaire measures: **sleep quality**, **sleep latency**, **sleep efficiency**, **daytime dysfunction** and **hypersomnia** (i.e., sleeping too much; the dataset can be downloaded here). We assume that the first three symptoms are related because they all measure insomnia as a latent variable, whereas daytime dysfunction and hypersomnia are related because they are symptoms of the lassitude factor.

**The proposed structure represented as a graphical model looks like this** (please note that the labels are somewhat idiosyncratic to make identifying the relevant components of the model easier later on):

### 14.3.1 Setting up our data

Let us first have a look at the data we want to use for model fitting. The `dat.cfa` dataset I will use here has a **special structure**: it is a so-called `list`, containing (1) another `list` of matrices and (2) a numeric vector. A list is a very versatile *R* object, which allows us to bind together different objects in one single big object. Lists can be accessed like `data.frame`s by using the `$` operator. By using the `names()` function, we can see the names of the objects in the list.
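As a minimal, self-contained illustration of this structure (using made-up toy objects, not the real `dat.cfa`), such a list can be built and accessed like this:

```r
# A toy list mimicking the structure described above: a list of
# matrices plus a vector of sample sizes (all values are invented)
toy.list <- list(data = list("Study A" = diag(2),
                             "Study B" = diag(2)),
                 n = c(120, 340))

names(toy.list)       # "data" "n"
toy.list$n            # access an element with the $ operator
names(toy.list$data)  # "Study A" "Study B"
```

The `dat.cfa` object follows this same pattern, just with 14 correlation matrices in its `data` element.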

`names(dat.cfa)`

`## [1] "data" "n"`

We see that the list contains two elements: our `data` and the `n` (sample size) of each study. The `data` object is itself a list, so we can get the names of its contents using the `names()` function, or display single elements of it through the `$` operator.

`names(dat.cfa$data)`

```
## [1] "Coleman et al. (2003)" "Salazar et al. (2008)"
## [3] "Newman et al. (2016)" "Delacruz et al. (2009)"
## [5] "Wyatt et al. (2002)" "Pacheco et al. (2016)"
## [7] "Riggs et al. (2015)" "Guerrero et al. (2010)"
## [9] "Brewer et al. (2007)" "Bridges et al. (1999)"
## [11] "Esparza et al. (2019)" "Charles et al. (1999)"
## [13] "Cooke et al. (2000)" "Ponce et al. (2017)"
```

```
dat.cfa$data$`Coleman et al. (2003)`
```

```
##            Quality Latency Efficiency DTDysf HypSomnia
## Quality       1.00    0.62       0.41  -0.48      0.00
## Latency       0.62    1.00       0.59  -0.10      0.35
## Efficiency    0.41    0.59       1.00   0.27      0.41
## DTDysf       -0.48   -0.10       0.27   1.00      0.37
## HypSomnia     0.00    0.35       0.41   0.37      1.00
```

We see that the `data` list contains one element for each of the 14 included studies. A closer look at the `Coleman et al. (2003)` study reveals that the data are stored as **correlation matrices** for our five observed variables. The Coleman et al. (2003) study contains correlation data for all variable combinations; however, we can also allow some studies to have missing values (coded as `NA`) in some of the fields, **because the meta-analytic SEM approach can handle missing data**, at least to some extent.
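To illustrate what such a partially missing correlation matrix could look like (the values below are invented and not taken from `dat.cfa`):

```r
# Toy correlation matrix with one missing correlation pair (illustrative values)
m <- matrix(c(   1, 0.40,   NA,
              0.40,    1, 0.20,
                NA, 0.20,    1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("Quality", "Latency", "Efficiency"),
                            c("Quality", "Latency", "Efficiency")))

# Count the missing (symmetric) correlation pairs
sum(is.na(m)) / 2  # 1
```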

Before we proceed, let us quickly show how you can construct such a list yourself. Let us assume that you have the correlation matrices stored in separate `data.frame`s, which you imported into *R* (Chapter 3). The important part here is that the **column names** and their **order** should already be **the same** in all of the `data.frame`s. Let us say I have two data frames containing the correlation data, `df1` and `df2`, which look like this:

`df1`

`df2`

I can transform these `data.frame`s into matrices using the `as.matrix()` function. Because we want the rows and columns to carry the names of the variables, I have to specify the **dimension names** (`dimnames`) of the matrices. The first dimension represents the rows, the second the columns. I can do this in *R* like this:

```
# Convert to matrices
mat1 <- as.matrix(df1)
mat2 <- as.matrix(df2)

# Set the dimension names
dimnames(mat1)[[1]] <- dimnames(mat1)[[2]] <- c("Quality", "Latency", "Efficiency",
                                                "DTDysf", "HypSomnia")
dimnames(mat2)[[1]] <- dimnames(mat2)[[2]] <- c("Quality", "Latency", "Efficiency",
                                                "DTDysf", "HypSomnia")

# Print the matrices
mat1
##            Quality Latency Efficiency DTDysf HypSomnia
## Quality       1.00    0.62       0.41  -0.48      0.00
## Latency       0.62    1.00       0.59  -0.10      0.35
## Efficiency    0.41    0.59       1.00   0.27      0.41
## DTDysf       -0.48   -0.10       0.27   1.00      0.37
## HypSomnia     0.00    0.35       0.41   0.37      1.00
mat2
##            Quality Latency Efficiency DTDysf HypSomnia
## Quality       1.00    0.39       0.53  -0.30     -0.05
## Latency       0.39    1.00       0.59   0.07      0.44
## Efficiency    0.53    0.59       1.00   0.09      0.22
## DTDysf       -0.30    0.07       0.09   1.00      0.45
## HypSomnia    -0.05    0.44       0.22   0.45      1.00
```

We can then bind these matrices together in a `list`, and give the list elements a name using the `names()` function.

```
matrix.list <- list(mat1, mat2)
names(matrix.list) <- c("Study1", "Study2")
```
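Since the pooling step requires all matrices to have an identical structure, it can be worth verifying this before proceeding. Here is a small sketch of such a check (`same_structure()` is a hypothetical helper written for illustration, not part of `metaSEM`):

```r
# Check that every matrix in a list shares the dimension names
# (and their order) of the first matrix
same_structure <- function(matrix.list) {
  ref <- dimnames(matrix.list[[1]])
  all(vapply(matrix.list,
             function(m) identical(dimnames(m), ref),
             logical(1)))
}

# Example with two conforming toy matrices
m1 <- m2 <- diag(2)
dimnames(m1) <- dimnames(m2) <- list(c("A", "B"), c("A", "B"))
same_structure(list(m1, m2))  # TRUE
```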

To do the modeling, we also need the total sample size \(N\) of each study. We only have to create a numeric vector which contains the sample sizes in the **same order** as the objects in our list. We can then create one big list, containing both the list with our correlation matrix data and the sample sizes for each study, again using the `list()` function.

```
n <- c(205, 830)
all.data <- list(data = matrix.list,
                 n = n)
```

Naming the elements `data` and `n` mirrors the structure of `dat.cfa`, so the combined object can be accessed in the same way (e.g. `all.data$data`).

That is it! We can now proceed to specifying our model for the `dat.cfa` data.

### 14.3.2 Model Specification

To specify our CFA model, we will have to use the **RAM specification** and the **TSSEM** procedure we mentioned before (Chapter 13.1). The `metaSEM` package directly follows the TSSEM approach; in fact, it even has two different functions for the two stages, `tssem1()` and `tssem2()`. The first pools our correlation matrices across all studies, and the second fits the proposed model to the pooled data.

#### 14.3.2.1 Stage 1

At the first stage, we pool the correlation matrices using the `tssem1()` function. There are four important parameters we have to specify in the function.

| Parameter | Description |
|---|---|
| `Cov` | A list of correlation matrices we want to pool. Please note that all correlation matrices in the list need to have an identical structure. |
| `n` | A numeric vector containing the sample sizes of each study, in the same order as the matrices contained in `Cov`. |
| `method` | Specifies if we want to use a fixed-effect model (`"FEM"`) or a random-effects model (`"REM"`). |
| `RE.type` | When a random-effects model is used, this specifies which of the random effects should be estimated. The default is `"Symm"`, which estimates all \(\tau^2\) values, including the covariances. When set to `"Diag"`, only the diagonal elements of the random-effects matrix are estimated, which means that we assume the random effects to be independent. Although `"Diag"` is a simplification of our model, it is often preferable, because fewer parameters have to be estimated. This particularly makes sense when the number of variables is high and/or the number of studies is low. |

For this step, I will assume a random-effects model and use `RE.type = "Diag"`. I will save the model as `cfa1`, and then call the `summary()` function on it to retrieve the output. Here is the code for that:

```
cfa1 <- tssem1(dat.cfa$data,
               dat.cfa$n,
               method = "REM",
               RE.type = "Diag")
summary(cfa1)
```

```
Call:
meta(y = ES, v = acovR, RE.constraints = Diag(paste0(RE.startvalues,
"*Tau2_", 1:no.es, "_", 1:no.es)), RE.lbound = RE.lbound,
I2 = I2, model.name = model.name, suppressWarnings = TRUE,
silent = silent, run = run)
95% confidence intervals: z statistic approximation
Coefficients:
Estimate Std.Error lbound ubound z value Pr(>|z|)
Intercept1 0.38971904 0.05429257 0.28330757 0.49613052 7.1781 7.068e-13 ***
Intercept2 0.43265876 0.04145136 0.35141559 0.51390194 10.4377 < 2.2e-16 ***
Intercept3 0.04945626 0.06071079 -0.06953470 0.16844722 0.8146 0.41529
Intercept4 0.09603706 0.04478711 0.00825593 0.18381818 2.1443 0.03201 *
Intercept5 0.42724237 0.03911621 0.35057601 0.50390873 10.9224 < 2.2e-16 ***
Intercept6 0.11929310 0.04106204 0.03881299 0.19977321 2.9052 0.00367 **
Intercept7 0.19292421 0.04757961 0.09966988 0.28617853 4.0548 5.018e-05 ***
Intercept8 0.22690159 0.03240893 0.16338126 0.29042192 7.0012 2.538e-12 ***
Intercept9 0.18105563 0.04258855 0.09758361 0.26452765 4.2513 2.126e-05 ***
Intercept10 0.43614968 0.03205961 0.37331400 0.49898536 13.6043 < 2.2e-16 ***
Tau2_1_1 0.03648373 0.01513055 0.00682838 0.06613907 2.4113 0.01590 *
Tau2_2_2 0.01963097 0.00859789 0.00277942 0.03648253 2.2832 0.02242 *
Tau2_3_3 0.04571437 0.01952285 0.00745030 0.08397845 2.3416 0.01920 *
Tau2_4_4 0.02236122 0.00995083 0.00285794 0.04186449 2.2472 0.02463 *
Tau2_5_5 0.01729073 0.00796404 0.00168149 0.03289996 2.1711 0.02992 *
Tau2_6_6 0.01815482 0.00895896 0.00059557 0.03571407 2.0264 0.04272 *
Tau2_7_7 0.02604880 0.01130265 0.00389601 0.04820160 2.3047 0.02119 *
Tau2_8_8 0.00988761 0.00513713 -0.00018098 0.01995620 1.9247 0.05426 .
Tau2_9_9 0.01988243 0.00895052 0.00233973 0.03742514 2.2214 0.02633 *
Tau2_10_10 0.01049222 0.00501578 0.00066148 0.02032297 2.0918 0.03645 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Q statistic on the homogeneity of effect sizes: 1220.333
Degrees of freedom of the Q statistic: 130
P value of the Q statistic: 0
Heterogeneity indices (based on the estimated Tau2):
Estimate
Intercept1: I2 (Q statistic) 0.9326
Intercept2: I2 (Q statistic) 0.8872
Intercept3: I2 (Q statistic) 0.9325
Intercept4: I2 (Q statistic) 0.8703
Intercept5: I2 (Q statistic) 0.8797
Intercept6: I2 (Q statistic) 0.8480
Intercept7: I2 (Q statistic) 0.8887
Intercept8: I2 (Q statistic) 0.7669
Intercept9: I2 (Q statistic) 0.8590
Intercept10: I2 (Q statistic) 0.8193
Number of studies (or clusters): 14
Number of observed statistics: 140
Number of estimated parameters: 20
Degrees of freedom: 120
-2 log likelihood: -112.4196
OpenMx status1: 0 ("0" or "1": The optimization is considered fine.
Other values may indicate problems.)
```

A look at the `OpenMx status1` shows that the model estimates are trustworthy. To make the results more easily digestible, we can extract the **fixed effects** (our estimated pooled correlations) using the `coef()` function. We then make a symmetric matrix out of the coefficients using `vec2symMat()`, and add the dimension names for easier interpretation.

```
# Extract the fixed coefficients (correlations)
fixed.coefs <- coef(cfa1, "fixed")
# Make a symmetric matrix
fc.mat <- vec2symMat(fixed.coefs, diag = FALSE)
# Label rows and columns
dimnames(fc.mat)[[1]] <- dimnames(fc.mat)[[2]] <- c("Quality", "Latency", "Efficiency", "DTDysf", "HypSomnia")
fc.mat
```

```
##               Quality   Latency Efficiency     DTDysf  HypSomnia
## Quality    1.00000000 0.3897190  0.4326588 0.04945626 0.09603706
## Latency    0.38971904 1.0000000  0.4272424 0.11929310 0.19292421
## Efficiency 0.43265876 0.4272424  1.0000000 0.22690159 0.18105563
## DTDysf     0.04945626 0.1192931  0.2269016 1.00000000 0.43614968
## HypSomnia  0.09603706 0.1929242  0.1810556 0.43614968 1.00000000
```

We can now see the **pooled correlation matrix** for our variables. Looking back at the model output, we also see that all correlation coefficients are **significant** (\(p<0.05\)), except one: the **correlation between sleep quality and daytime dysfunction**. From the perspective of our model, this makes sense, because we expect these two variables to load on different factors. We also find that the \(I^2\) values for the different estimates are very large (\(76\%-93\%\)). We may therefore also examine whether a model fitted in different subclusters of studies reduces the amount of heterogeneity we find, and whether this translates to different SEM fits at stage 2.

#### 14.3.2.2 Stage 2

After pooling the correlation matrices, it is now time to determine if our proposed factor model does indeed fit the data. To specify our model, we have to resort to the **RAM** formulation this time, and specify the \(\mathbf{A}\), \(\mathbf{S}\) and \(\mathbf{F}\) matrices. To fill the fields, it is often easier to construct an empty matrix first. In the rows and columns, the matrices will not only contain the observed variables, but also the **latent variables** we want to estimate, `f_Insomnia` and `f_Lassitude`. Here is how we can create a zero matrix as a blueprint:

```
# Create vector of column/row names
dims <- c("Quality", "Latency", "Efficiency", "DTDysf", "HypSomnia", "f_Insomnia", "f_Lassitude")
# Create 7x7 matrix of zeros
mat <- matrix(rep(0, 7*7), nrow = 7, ncol = 7)
# Label the rows and columns
dimnames(mat)[[1]] <- dimnames(mat)[[2]] <- dims
mat
```

```
## Quality Latency Efficiency DTDysf HypSomnia f_Insomnia
## Quality 0 0 0 0 0 0
## Latency 0 0 0 0 0 0
## Efficiency 0 0 0 0 0 0
## DTDysf 0 0 0 0 0 0
## HypSomnia 0 0 0 0 0 0
## f_Insomnia 0 0 0 0 0 0
## f_Lassitude 0 0 0 0 0 0
## f_Lassitude
## Quality 0
## Latency 0
## Efficiency 0
## DTDysf 0
## HypSomnia 0
## f_Insomnia 0
## f_Lassitude 0
```

\[~\]

**\(\mathbf{A}\) Matrix**

In the \(\mathbf{A}\) matrix, we specify the asymmetrical (i.e., single-headed) arrows in our model. The logic, again, is that the arrow starts at the **column variable** and ends where the column meets the entry of the **row variable**. All other fields, which do not represent arrows, are filled with `0`. We specify that an arrow has to be estimated by providing a character string. This character string starts with a **starting value** for the optimization procedure (usually somewhere between \(0.1\) and \(0.3\)), followed by `*`. After the `*` symbol, we specify a **label** for the value. If two fields in the \(\mathbf{A}\) matrix have the same label, this means that we assume that the fields have the **same value**. We assume a starting value of \(0.3\) for all estimated arrows, and label the fields according to the graph from above. We can do this like this:

```
A <- matrix(c(0, 0, 0, 0, 0, "0.3*Ins_Q", 0,
              0, 0, 0, 0, 0, "0.3*Ins_L", 0,
              0, 0, 0, 0, 0, "0.3*Ins_E", 0,
              0, 0, 0, 0, 0, 0,           "0.3*Las_D",
              0, 0, 0, 0, 0, 0,           "0.3*Las_H",
              0, 0, 0, 0, 0, 0,           0,
              0, 0, 0, 0, 0, 0,           0),
            nrow = 7, ncol = 7, byrow = TRUE)

# Label columns and rows
dimnames(A)[[1]] <- dimnames(A)[[2]] <- dims
A
```

```
## Quality Latency Efficiency DTDysf HypSomnia f_Insomnia
## Quality "0" "0" "0" "0" "0" "0.3*Ins_Q"
## Latency "0" "0" "0" "0" "0" "0.3*Ins_L"
## Efficiency "0" "0" "0" "0" "0" "0.3*Ins_E"
## DTDysf "0" "0" "0" "0" "0" "0"
## HypSomnia "0" "0" "0" "0" "0" "0"
## f_Insomnia "0" "0" "0" "0" "0" "0"
## f_Lassitude "0" "0" "0" "0" "0" "0"
## f_Lassitude
## Quality "0"
## Latency "0"
## Efficiency "0"
## DTDysf "0.3*Las_D"
## HypSomnia "0.3*Las_H"
## f_Insomnia "0"
## f_Lassitude "0"
```

Looks good so far. The last step is to plug the `A` matrix into the `as.mxMatrix()` function to make it usable for the later steps.

`A <- as.mxMatrix(A)`

\[~\]

**\(\mathbf{S}\) Matrix**

In the \(\mathbf{S}\) matrix, we specify the variances we want to estimate. We want to estimate the variance for all observed variables, as well as the correlation between our latent variables. We fix the variance of each latent variable (the diagonal elements of the latent part) to \(1\). We use a starting value of \(0.2\) for the variances of the observed variables, and \(0.3\) for the correlation. Here is how we construct the matrix:

```
# Make a diagonal matrix for the variances
Vars <- Diag(c("0.2*var_Q", "0.2*var_L", "0.2*var_E", "0.2*var_D", "0.2*var_H"))

# Make the matrix for the latent variables
Cors <- matrix(c(1,                "0.3*cor_InsLas",
                 "0.3*cor_InsLas", 1),
               nrow = 2, ncol = 2)

# Combine
S <- bdiagMat(list(Vars, Cors))

# Label columns and rows
dimnames(S)[[1]] <- dimnames(S)[[2]] <- dims
S
```

```
## Quality Latency Efficiency DTDysf HypSomnia
## Quality "0.2*var_Q" "0" "0" "0" "0"
## Latency "0" "0.2*var_L" "0" "0" "0"
## Efficiency "0" "0" "0.2*var_E" "0" "0"
## DTDysf "0" "0" "0" "0.2*var_D" "0"
## HypSomnia "0" "0" "0" "0" "0.2*var_H"
## f_Insomnia "0" "0" "0" "0" "0"
## f_Lassitude "0" "0" "0" "0" "0"
## f_Insomnia f_Lassitude
## Quality "0" "0"
## Latency "0" "0"
## Efficiency "0" "0"
## DTDysf "0" "0"
## HypSomnia "0" "0"
## f_Insomnia "1" "0.3*cor_InsLas"
## f_Lassitude "0.3*cor_InsLas" "1"
```

And again, we transform the matrix using `as.mxMatrix()`.

`S <- as.mxMatrix(S)`

\[~\]

**\(\mathbf{F}\) Matrix**

The \(\mathbf{F}\) matrix is quite easily specified: in the diagonal elements of the **observed variables**, we fill in `1`; elsewhere, we specify `0`. From the \(\mathbf{F}\) matrix, we then **only select the rows in which at least one element is not zero**.

```
# Construct diagonal matrix
F <- Diag(c(1, 1, 1, 1, 1, 0, 0))
# Only select non-null rows
F <- F[1:5,]
# Specify row and column labels
dimnames(F)[[1]] <- dims[1:5]
dimnames(F)[[2]] <- dims
F
```

```
## Quality Latency Efficiency DTDysf HypSomnia f_Insomnia
## Quality 1 0 0 0 0 0
## Latency 0 1 0 0 0 0
## Efficiency 0 0 1 0 0 0
## DTDysf 0 0 0 1 0 0
## HypSomnia 0 0 0 0 1 0
## f_Lassitude
## Quality 0
## Latency 0
## Efficiency 0
## DTDysf 0
## HypSomnia 0
```

`F <- as.mxMatrix(F)`

\[~\]

**Model fitting**

Now it is time to fit our proposed model to the data. To do this, we use the `tssem2()` function. We only have to provide the **stage 1 model** `cfa1`, the three matrices, and specify `diag.constraints = FALSE`, because we are not fitting a mediation model. I save the resulting object as `cfa2` and then access it using `summary()`.

```
cfa2 <- tssem2(cfa1,
               Amatrix = A,
               Smatrix = S,
               Fmatrix = F,
               diag.constraints = FALSE)
summary(cfa2)
```

```
##
## Call:
## wls(Cov = pooledS, aCov = aCov, n = tssem1.obj$total.n, Amatrix = Amatrix,
## Smatrix = Smatrix, Fmatrix = Fmatrix, diag.constraints = diag.constraints,
## cor.analysis = cor.analysis, intervals.type = intervals.type,
## mx.algebras = mx.algebras, model.name = model.name, suppressWarnings = suppressWarnings,
## silent = silent, run = run)
##
## 95% confidence intervals: z statistic approximation
## Coefficients:
## Estimate Std.Error lbound ubound z value
## Las_D 0.679955 0.075723 0.531541 0.828370 8.9795
## Ins_E 0.760455 0.061964 0.639009 0.881901 12.2726
## Las_H 0.641842 0.072459 0.499825 0.783859 8.8580
## Ins_L 0.590630 0.052649 0.487439 0.693821 11.2182
## Ins_Q 0.569435 0.052425 0.466684 0.672187 10.8619
## cor_InsLas 0.377691 0.047402 0.284785 0.470596 7.9679
## Pr(>|z|)
## Las_D < 0.00000000000000022 ***
## Ins_E < 0.00000000000000022 ***
## Las_H < 0.00000000000000022 ***
## Ins_L < 0.00000000000000022 ***
## Ins_Q < 0.00000000000000022 ***
## cor_InsLas 0.000000000000001554 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Goodness-of-fit indices:
## Value
## Sample size 4496.0000
## Chi-square of target model 7.8204
## DF of target model 4.0000
## p value of target model 0.0984
## Number of constraints imposed on "Smatrix" 0.0000
## DF manually adjusted 0.0000
## Chi-square of independence model 501.6766
## DF of independence model 10.0000
## RMSEA 0.0146
## RMSEA lower 95% CI 0.0000
## RMSEA upper 95% CI 0.0297
## SRMR 0.0436
## TLI 0.9806
## CFI 0.9922
## AIC -0.1796
## BIC -25.8234
## OpenMx status1: 0 ("0" or "1": The optimization is considered fine.
## Other values indicate problems.)
```

We again see that the `OpenMx status1` is `0`, meaning that the optimization worked fine. We get the estimates for the different paths between the latent variables and the observed symptoms, such as \(0.68\) for \(Lassitude \rightarrow Daytime~Dysfunction\) (`Las_D`). The important part, however, is to check **how well the assumed model fits our data**. We can have a look at the second, third and fourth rows of the `Goodness-of-fit indices`, where we see that the **goodness of fit** test is \(\chi^{2}_{4}=7.82\) (\(N=4496\)), which is **not significant** (\(p=0.098\)), meaning that the model cannot be rejected. We also see that the **Root Mean Square Error of Approximation** (**RMSEA**; Steiger and Lind 1980) value is \(RMSEA=0.0146\). As a rule of thumb, a model can be considered to fit the data well when the \(RMSEA\) is smaller than (or close to) \(0.05\), while values greater than \(0.10\) indicate poor fit. Together with the high \(CFI\) and \(TLI\) values, the indices therefore indicate that our proposed two-factor model **fits the data well**. Nevertheless, given the substantial between-study heterogeneity we found at stage 1, a **subgroup analysis** may still be worthwhile, in which the model is fitted to subgroups of studies which share some kind of characteristic (e.g., age group, country of origin, etc.), to see if there might be subclusters in which the model fits better than in others.

Please be aware that a common problem in SEM studies is that researchers often only focus on their **one proposed model** and whether it fits the data. If the model is found to fit the data closely, it is often directly assumed that the data prove the underlying structure or theory. This is **problematic**, because more than one model can achieve a good fit to the same data. It is therefore necessary to also check **alternative model hypotheses** and structures. If an alternative model fits the data well too, it becomes less clear if our proposed structure really is the "correct" one.
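For instance, a competing **one-factor model** (all five symptoms loading on a single "sleep problems" factor) could be specified with the same RAM conventions used above and compared to our two-factor solution via the fit indices. The following is only a sketch: the labels (e.g. `SP_Q`) and the factor name are made up for illustration, and the comparison itself would rerun `tssem2()` with these matrices.

```r
# RAM matrices for a hypothetical one-factor alternative model
dims.alt <- c("Quality", "Latency", "Efficiency", "DTDysf",
              "HypSomnia", "f_SleepProblems")

# A matrix: all observed variables load on the single latent factor
A.alt <- matrix("0", 6, 6, dimnames = list(dims.alt, dims.alt))
A.alt[1:5, "f_SleepProblems"] <- c("0.3*SP_Q", "0.3*SP_L", "0.3*SP_E",
                                   "0.3*SP_D", "0.3*SP_H")

# S matrix: free error variances; latent variance fixed to 1
S.alt <- matrix("0", 6, 6, dimnames = list(dims.alt, dims.alt))
diag(S.alt) <- c("0.2*var_Q", "0.2*var_L", "0.2*var_E",
                 "0.2*var_D", "0.2*var_H", "1")

# F matrix: select the five observed variables
F.alt <- cbind(diag(5), 0)
dimnames(F.alt) <- list(dims.alt[1:5], dims.alt)

# Fitting and comparison (not run here; requires the cfa1 object from above):
# cfa2.alt <- tssem2(cfa1,
#                    Amatrix = as.mxMatrix(A.alt),
#                    Smatrix = as.mxMatrix(S.alt),
#                    Fmatrix = as.mxMatrix(F.alt),
#                    diag.constraints = FALSE)
# summary(cfa2.alt)  # compare e.g. AIC/BIC with the two-factor model
```

If the one-factor model achieved a comparably good fit, this would weaken the claim that the two-factor structure is the "correct" one.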

#### 14.3.2.3 Plotting the Model

After fitting the model, `metaSEM` makes it quite easy for us to visualize it graphically. To do this, we first have to install and load the `semPlot` package (Epskamp 2019).

`library(semPlot)`

To plot the model, we have to **convert** it into a format that `semPlot` can use. We can use the `meta2semPlot()` function to do this.

`cfa.plot <- meta2semPlot(cfa2)`

We can then use the `semPaths()` function in `semPlot` to generate the plot. This function has many parameters, which you can access by typing `?semPaths` into the console and hitting Enter. Here is my code, and the resulting plot:

```
# Create plot labels (left to right, bottom to top)
labels <- c("Sleep\nQuality", "Sleep\nLatency", "Sleep\nEfficiency",
            "Daytime\nDysfunction", "Hyper-\nsomnia",
            "Insomnia", "Lassitude")

# Plot
semPaths(cfa.plot,
         whatLabels = "est",
         edge.color = "black",
         nodeLabels = labels)
```

### References

Thompson, Bruce. 2004. *Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications.* American Psychological Association.

Koffel, Erin, and David Watson. 2009. “The Two-Factor Structure of Sleep Complaints and Its Relation to Depression and Anxiety.” *Journal of Abnormal Psychology* 118 (1). American Psychological Association: 183.

Steiger, James H, and John C Lind. 1980. "Statistically Based Tests for the Number of Common Factors." Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA.

Epskamp, Sacha. 2019. *SemPlot: Path Diagrams and Visual Analysis of Various Sem Packages’ Output*. https://CRAN.R-project.org/package=semPlot.