Chapter 1 Introduction to confirmatory factor analysis
1.1 The principal component factor analysis approach
- PCFA attempts to account for all of the variance and covariance in the set of items, rather than only the portion of the covariance that the items share in common.
- This may not be the best approach, but it is the most common.
- We illustrate PCFA with NLSY97 data to construct a measure of conservatism.
library(haven)
nlsy.data <- read_dta("nlsy97cfa.dta")
summary(nlsy.data)
## r0000100 x1 x2 x3 x4
## Min. : 1 Min. :1.000 Min. :1.00 Min. :1.000 Min. :1.000
## 1st Qu.:2249 1st Qu.:2.000 1st Qu.:1.00 1st Qu.:1.000 1st Qu.:1.000
## Median :4502 Median :2.000 Median :1.00 Median :1.000 Median :1.000
## Mean :4504 Mean :2.332 Mean :1.62 Mean :1.416 Mean :1.365
## 3rd Qu.:6758 3rd Qu.:3.000 3rd Qu.:2.00 3rd Qu.:2.000 3rd Qu.:2.000
## Max. :9022 Max. :4.000 Max. :4.00 Max. :4.000 Max. :4.000
## NA's :1 NA's :7152 NA's :7126 NA's :7111 NA's :7113
## x5 x6 x7 x8
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:1.000 1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.000
## Median :2.000 Median :2.000 Median :2.000 Median :1.000
## Mean :1.773 Mean :2.277 Mean :2.229 Mean :1.309
## 3rd Qu.:2.000 3rd Qu.:3.000 3rd Qu.:3.000 3rd Qu.:2.000
## Max. :4.000 Max. :4.000 Max. :4.000 Max. :4.000
## NA's :7170 NA's :7174 NA's :7210 NA's :7110
## x9 x10 x11 x12
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:3.000 1st Qu.:2.000
## Median :2.000 Median :1.000 Median :3.000 Median :2.000
## Mean :1.705 Mean :1.391 Mean :3.224 Mean :2.233
## 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:4.000 3rd Qu.:3.000
## Max. :4.000 Max. :4.000 Max. :4.000 Max. :4.000
## NA's :7138 NA's :7125 NA's :1690 NA's :1588
## x13
## Min. :1.000
## 1st Qu.:3.000
## Median :4.000
## Mean :3.655
## 3rd Qu.:4.000
## Max. :4.000
## NA's :1694
# PCFA model
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.1.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
pcfa <-
  nlsy.data %>%
  select(x1:x10) %>%
  drop_na() %>%
  principal(covar = TRUE, rotate = "none", nfactors = 10)
pcfa
## Principal Components Analysis
## Call: principal(r = ., nfactors = 10, rotate = "none", covar = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 h2 u2 com
## x1 0.61 -0.38 -0.29 0.01 0.22 -0.07 0.55 0.22 -0.06 -0.07 1 2.2e-16 3.9
## x2 0.58 0.04 -0.61 0.10 0.22 0.20 -0.16 -0.37 0.15 -0.01 1 4.4e-16 3.6
## x3 0.72 0.21 -0.08 -0.29 0.13 -0.33 -0.11 0.07 -0.17 0.41 1 2.2e-16 3.1
## x4 0.72 0.32 -0.04 -0.17 -0.01 -0.33 -0.17 0.16 0.10 -0.42 1 5.6e-16 3.1
## x5 0.58 -0.03 -0.24 0.44 -0.59 0.02 -0.11 0.21 -0.06 0.08 1 -2.2e-16 3.7
## x6 0.61 -0.45 0.35 0.08 -0.10 -0.31 0.01 -0.24 0.36 0.08 1 -6.7e-16 4.5
## x7 0.60 -0.33 0.23 0.02 0.31 0.37 -0.36 0.32 0.02 0.01 1 1.1e-16 4.9
## x8 0.60 0.33 0.13 -0.37 -0.27 0.42 0.25 0.03 0.24 0.07 1 5.6e-16 4.9
## x9 0.73 -0.16 0.24 -0.10 -0.14 0.12 0.02 -0.34 -0.44 -0.15 1 1.1e-16 2.9
## x10 0.45 0.52 0.33 0.54 0.29 0.02 0.18 -0.05 -0.01 0.03 1 2.2e-16 4.5
##
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
## SS loadings 3.92 1.01 0.88 0.77 0.74 0.69 0.61 0.54 0.45 0.39
## Proportion Var 0.39 0.10 0.09 0.08 0.07 0.07 0.06 0.05 0.04 0.04
## Cumulative Var 0.39 0.49 0.58 0.66 0.73 0.80 0.86 0.92 0.96 1.00
## Proportion Explained 0.39 0.10 0.09 0.08 0.07 0.07 0.06 0.05 0.04 0.04
## Cumulative Proportion 0.39 0.49 0.58 0.66 0.73 0.80 0.86 0.92 0.96 1.00
##
## Mean item complexity = 3.9
## Test of the hypothesis that 10 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0
## with the empirical chi square 0 with prob < NA
##
## Fit based upon off diagonal values = 1
The first eigenvalue, 3.92, shows how much of the total variance is explained by the first factor.
PCFA analyzes the correlation matrix, in which each item is standardized to have a variance of 1. With 10 items, the eigenvalues add up to 10. Because the first factor accounts for 3.92 of those 10 units, it explains 39.2% of the variance in the set of items.
Any factor with an eigenvalue of less than 1.0 can usually be ignored.
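As a minimal sketch of the arithmetic above, the base R code below checks that the eigenvalues of a correlation matrix sum to the number of items, so each eigenvalue divided by that number is a proportion of variance. The simulated data, the seed, and the loading of 0.6 are made up for illustration and are not taken from the NLSY97 file.

```r
# Illustrative only: simulated items, not the NLSY97 data.
set.seed(123)
n_obs   <- 500
n_items <- 10

# One common factor plus item-specific noise (hypothetical loading of 0.6).
common <- rnorm(n_obs)
items  <- sapply(seq_len(n_items), function(j) 0.6 * common + rnorm(n_obs, sd = 0.8))

# Eigenvalues of the correlation matrix.
eigenvalues <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# The eigenvalues sum to the number of items (the trace of the matrix).
sum(eigenvalues)

# Proportion of total variance attributed to each component.
prop_var <- eigenvalues / n_items
round(prop_var, 2)

# Rule of thumb from the text: how many eigenvalues exceed 1?
sum(eigenvalues > 1)
```

The same division turns the 3.92 reported for PC1 into the 0.39 shown in the "Proportion Var" row.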
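We drop the last item, x10, and refit: its loading on the first component (0.45) is the lowest of the ten items, and the second eigenvalue (1.01) only barely exceeds 1.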
pcfa <-
  nlsy.data %>%
  select(x1:x9) %>%
  drop_na() %>%
  principal(covar = TRUE, rotate = "none", nfactors = 9)
pcfa
## Principal Components Analysis
## Call: principal(r = ., nfactors = 9, rotate = "none", covar = TRUE)
## Standardized loadings (pattern matrix) based upon correlation matrix
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 h2 u2 com
## x1 0.62 0.24 0.40 -0.20 -0.07 0.53 0.25 -0.06 -0.07 1 -8.9e-16 3.8
## x2 0.59 -0.26 0.57 -0.17 0.20 -0.16 -0.38 0.14 0.00 1 1.1e-15 3.9
## x3 0.72 -0.29 -0.12 -0.25 -0.32 -0.08 0.06 -0.17 0.42 1 -6.7e-16 3.1
## x4 0.71 -0.35 -0.16 -0.07 -0.32 -0.16 0.15 0.10 -0.42 1 -4.4e-16 3.2
## x5 0.58 -0.07 0.28 0.71 0.01 -0.12 0.21 -0.07 0.08 1 0.0e+00 2.6
## x6 0.62 0.54 -0.16 0.14 -0.31 0.02 -0.24 0.36 0.08 1 2.2e-16 3.9
## x7 0.61 0.41 -0.06 -0.25 0.38 -0.40 0.30 0.02 0.01 1 -2.2e-16 4.5
## x8 0.60 -0.35 -0.40 0.09 0.43 0.32 0.02 0.23 0.08 1 -6.7e-16 4.6
## x9 0.74 0.19 -0.24 0.09 0.12 0.06 -0.34 -0.45 -0.15 1 3.3e-16 2.8
##
## PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9
## SS loadings 3.76 0.95 0.85 0.75 0.69 0.62 0.54 0.45 0.39
## Proportion Var 0.42 0.11 0.09 0.08 0.08 0.07 0.06 0.05 0.04
## Cumulative Var 0.42 0.52 0.62 0.70 0.78 0.85 0.91 0.96 1.00
## Proportion Explained 0.42 0.11 0.09 0.08 0.08 0.07 0.06 0.05 0.04
## Cumulative Proportion 0.42 0.52 0.62 0.70 0.78 0.85 0.91 0.96 1.00
##
## Mean item complexity = 3.6
## Test of the hypothesis that 9 components are sufficient.
##
## The root mean square of the residuals (RMSR) is 0
## with the empirical chi square 0 with prob < NA
##
## Fit based upon off diagonal values = 1
- These are good results: only one factor has an eigenvalue greater than 1, and all 9 items load above 0.5 on that factor.
1.2 Alpha reliability for our nine-item scale
- The next step is to assess the reliability of our 9-item conservatism scale.
psych::alpha(nlsy.data[ , c(2:10)])
##
## Reliability analysis
## Call: psych::alpha(x = nlsy.data[, c(2:10)])
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.81 0.82 0.82 0.34 4.5 0.003 1.8 0.51 0.33
##
## lower alpha upper 95% confidence boundaries
## 0.8 0.81 0.81
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## x1 0.79 0.80 0.80 0.34 4.1 0.0033 0.0073 0.33
## x2 0.79 0.81 0.80 0.34 4.2 0.0032 0.0065 0.34
## x3 0.78 0.79 0.78 0.32 3.8 0.0034 0.0050 0.33
## x4 0.78 0.79 0.78 0.32 3.8 0.0034 0.0052 0.33
## x5 0.79 0.81 0.80 0.35 4.3 0.0032 0.0069 0.35
## x6 0.79 0.80 0.79 0.34 4.1 0.0033 0.0054 0.34
## x7 0.79 0.81 0.80 0.34 4.1 0.0033 0.0067 0.34
## x8 0.80 0.81 0.80 0.35 4.2 0.0032 0.0056 0.34
## x9 0.77 0.79 0.78 0.32 3.7 0.0035 0.0057 0.32
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## x1 1833 0.66 0.63 0.56 0.51 2.3 1.02
## x2 1859 0.59 0.60 0.52 0.46 1.6 0.80
## x3 1874 0.67 0.70 0.67 0.58 1.4 0.67
## x4 1872 0.66 0.70 0.66 0.57 1.4 0.63
## x5 1815 0.58 0.59 0.50 0.45 1.8 0.81
## x6 1811 0.65 0.62 0.56 0.51 2.3 0.93
## x7 1775 0.66 0.61 0.54 0.49 2.2 1.07
## x8 1875 0.54 0.59 0.51 0.44 1.3 0.56
## x9 1847 0.72 0.73 0.69 0.63 1.7 0.74
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## x1 0.25 0.34 0.25 0.16 0.80
## x2 0.54 0.33 0.09 0.04 0.79
## x3 0.67 0.27 0.05 0.02 0.79
## x4 0.70 0.25 0.04 0.01 0.79
## x5 0.43 0.41 0.12 0.04 0.80
## x6 0.22 0.40 0.26 0.12 0.80
## x7 0.32 0.28 0.23 0.16 0.80
## x8 0.73 0.23 0.03 0.01 0.79
## x9 0.44 0.43 0.10 0.02 0.79
The alpha is 0.81, which is above the conventional minimum of 0.70.
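To make the alpha figures less of a black box, here is a minimal base R sketch of the two textbook formulas. The simulated items and the helper names `cronbach_alpha()` and `std_alpha()` are hypothetical. The helper uses listwise deletion as a simplifying assumption, so it will not necessarily match `psych::alpha()` on data with missing values.

```r
# Illustrative only: simulated continuous items, not the NLSY97 data.
set.seed(42)
n_obs <- 300
trait <- rnorm(n_obs)
items <- sapply(1:9, function(j) trait + rnorm(n_obs))

# Raw alpha: k / (k - 1) * (1 - sum of item variances / variance of the total).
cronbach_alpha <- function(x) {
  x <- x[complete.cases(x), , drop = FALSE]  # listwise deletion (assumption)
  k <- ncol(x)
  item_vars <- apply(x, 2, var)
  total_var <- var(rowSums(x))
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

raw_alpha <- cronbach_alpha(items)
raw_alpha

# Standardized alpha from the number of items and the average inter-item correlation.
std_alpha <- function(k, avg_r) k * avg_r / (1 + (k - 1) * avg_r)

# Plugging in the rounded average_r of 0.34 reported above for 9 items:
std_alpha(9, 0.34)
```

With the rounded average_r from the output, the second formula lands close to the reported std.alpha of 0.82.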
The raw_alpha column under "Reliability if an item is dropped" shows what alpha would be if we dropped any single item from our scale. If dropping an item would raise the alpha, then we should consider whether that item is measuring the same concept as the other items.
To obtain the scale score for each person in our sample, we simply compute the total or mean score for the 9 items.
nlsy.data <-
  nlsy.data %>%
  rowwise() %>%
  mutate(m = mean(c(x1, x2, x3, x4, x5, x6, x7, x8, x9), na.rm = TRUE))
describe(nlsy.data$m)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 1888 1.78 0.51 1.69 1.74 0.53 1 4 3 0.72 0.53 0.01
hist(nlsy.data$m)
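The row-wise mean above can also be sketched in base R with `rowMeans()`. The toy data frame and the "at least 2 of 3 items" rule below are hypothetical and only show how `na.rm = TRUE` behaves when respondents skip items.

```r
# Toy data with the same 1-4 response range; values are made up.
toy <- data.frame(
  x1 = c(1, 2, NA, 4),
  x2 = c(2, 2, 3, NA),
  x3 = c(1, NA, 3, NA)
)
item_cols <- c("x1", "x2", "x3")

# Mean of the items each person actually answered.
toy$m <- rowMeans(toy[, item_cols], na.rm = TRUE)

# Number of items answered, useful for requiring a minimum before scoring.
toy$n_answered <- rowSums(!is.na(toy[, item_cols]))

# Hypothetical rule: keep the score only if at least 2 of 3 items were answered.
toy$m_strict <- ifelse(toy$n_answered >= 2, toy$m, NA)
toy
```

With `na.rm = TRUE` alone, a person who answered a single item still gets a score, which is worth keeping in mind when most rows have missing items.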
1.3 Generating a factor score rather than a mean or summative score
You can generate a factor score that weights each item according to how salient it is to the concept being measured.
If the loadings are similar, then factor scores will be very similar to simple mean scores.
# pcfa <-
# nlsy.data %>%
# select(x1:x9) %>%
# drop_na() %>%
# principal(covar = TRUE, rotate = "none", nfactors = 9, scores = TRUE)
# nlsy.data$factor_scores <- pcfa$scores
#
# hist(nlsy.data$factor_scores)
# cor(nlsy.data$m, nlsy.data$factor_scores)
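The block above is commented out in the source. One plausible obstacle (an assumption, the source does not say) is that `drop_na()` shortens the data, so the scores no longer line up with the rows of `nlsy.data`. The sketch below shows one way to write component scores back to the right rows. It uses simulated data and base R's `prcomp()` as a stand-in for `psych::principal()`, so the numbers are illustrative only.

```r
# Illustrative only: simulated items, not the NLSY97 data.
set.seed(1)
n_obs <- 200
trait <- rnorm(n_obs)
items <- as.data.frame(sapply(1:4, function(j) 0.7 * trait + rnorm(n_obs, sd = 0.7)))
names(items) <- paste0("x", 1:4)
item_cols <- paste0("x", 1:4)

# Introduce a couple of missing responses.
items$x2[c(3, 10)] <- NA

# Fit on complete cases only, remembering which rows were used.
keep <- complete.cases(items[, item_cols])
pc   <- prcomp(items[keep, item_cols], center = TRUE, scale. = TRUE)

# Write first-component scores back to the matching rows; others stay NA.
items$pc1 <- NA_real_
items$pc1[keep] <- pc$x[, 1]

# Compare with the simple mean score (the sign of a component is arbitrary).
items$m <- rowMeans(items[, item_cols], na.rm = TRUE)
abs(cor(items$pc1, items$m, use = "complete.obs"))
```

Because the simulated loadings are equal, the weighted and unweighted scores correlate very highly, which is the point made in the text.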
1.4 What can CFA add?
CFA and other factor analysis methods allow each item to have its own unique variance.
Confirmatory models specify the factor that underlies the responses to the items; here, all the items are indicators of conservatism.
With CFA we estimate the variance of each item. We thereby acknowledge that each item may have some unique variance, which we treat as random error.
We get a better measure of the latent variable by isolating the shared variance from each item's unique variance.
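The points above can be written as a one-factor measurement model. The notation is an assumption, since the source uses no symbols: \(\xi\) is the latent conservatism score with variance \(\phi\), \(\lambda_j\) is the loading of item \(x_j\), and \(\varepsilon_j\) is the unique (error) part of the item with variance \(\theta_j\), taken to be uncorrelated with \(\xi\).

```latex
\[
  x_j = \lambda_j \, \xi + \varepsilon_j , \qquad j = 1, \dots, 9
\]
\[
  \operatorname{Var}(x_j) = \lambda_j^{2} \, \phi + \theta_j
\]
```

The first term of the variance is the part shared through the factor, and \(\theta_j\) is the part that CFA sets aside as unique to the item.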
1.5 Fitting a CFA model
library(lavaan)
## This is lavaan 0.6-9
## lavaan is FREE software! Please report any bugs.
##
## Attaching package: 'lavaan'
## The following object is masked from 'package:psych':
##
## cor2cov
cfa.model <- '
  # latent variable model
  conservative =~ 1 * x1 + a2 * x2 + a3 * x3 + a4 * x4 + a5 * x5 + a6 * x6 +
                  a7 * x7 + a8 * x8 + a9 * x9
'
cfa.fit <- cfa(cfa.model, data = nlsy.data)
summary(cfa.fit)
## lavaan 0.6-9 ended normally after 24 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
##
## Used Total
## Number of observations 1625 8985
##
## Model Test User Model:
##
## Test statistic 419.007
## Degrees of freedom 27
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## conservative =~
## x1 1.000
## x2 (a2) 0.738 0.046 16.153 0.000
## x3 (a3) 0.827 0.043 19.425 0.000
## x4 (a4) 0.756 0.039 19.222 0.000
## x5 (a5) 0.738 0.046 15.941 0.000
## x6 (a6) 0.915 0.054 16.992 0.000
## x7 (a7) 1.028 0.062 16.607 0.000
## x8 (a8) 0.549 0.033 16.675 0.000
## x9 (a9) 0.928 0.048 19.475 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.729 0.028 26.051 0.000
## .x2 0.471 0.018 26.439 0.000
## .x3 0.240 0.010 23.378 0.000
## .x4 0.215 0.009 23.722 0.000
## .x5 0.495 0.019 26.540 0.000
## .x6 0.590 0.023 25.970 0.000
## .x7 0.820 0.031 26.201 0.000
## .x8 0.230 0.009 26.162 0.000
## .x9 0.297 0.013 23.287 0.000
## conservative 0.316 0.029 11.039 0.000
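The "Used 1625 / Total 8985" line in the output reflects lavaan's default listwise deletion: only rows with every model item observed enter the estimation. A rough base R sketch of that count, on a made-up toy data frame, is:

```r
# Toy data; values are made up to show the counting logic only.
toy <- data.frame(
  x1 = c(1, 2, NA, 4, 3),
  x2 = c(2, NA, 3, 1, 2),
  x3 = c(1, 2, 3, 4, NA)
)
model_items <- c("x1", "x2", "x3")

# Rows with no missing value on any model item.
used  <- sum(complete.cases(toy[, model_items]))
total <- nrow(toy)
c(used = used, total = total)
```

Applied to the nine items in `nlsy.data`, the same count should correspond to the "Used" figure, assuming no other rows are excluded.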
1.6 Interpreting and presenting CFA results
The first factor loading is fixed at 1; its item (x1) is called the reference indicator.
Standardized estimates rescale the solution so that the latent variable and every item have a variance of 1.
standardizedSolution(cfa.fit)
## lhs op rhs label est.std se z pvalue ci.lower
## 1 conservative =~ x1 0.550 0.020 27.583 0 0.511
## 2 conservative =~ x2 a2 0.517 0.021 24.926 0 0.476
## 3 conservative =~ x3 a3 0.688 0.016 42.841 0 0.657
## 4 conservative =~ x4 a4 0.676 0.016 41.107 0 0.643
## 5 conservative =~ x5 a5 0.508 0.021 24.208 0 0.467
## 6 conservative =~ x6 a6 0.556 0.020 28.118 0 0.517
## 7 conservative =~ x7 a7 0.538 0.020 26.575 0 0.498
## 8 conservative =~ x8 a8 0.541 0.020 26.838 0 0.501
## 9 conservative =~ x9 a9 0.691 0.016 43.295 0 0.660
## 10 x1 ~~ x1 0.698 0.022 31.834 0 0.655
## 11 x2 ~~ x2 0.733 0.021 34.138 0 0.690
## 12 x3 ~~ x3 0.526 0.022 23.804 0 0.483
## 13 x4 ~~ x4 0.544 0.022 24.471 0 0.500
## 14 x5 ~~ x5 0.742 0.021 34.848 0 0.700
## 15 x6 ~~ x6 0.691 0.022 31.422 0 0.648
## 16 x7 ~~ x7 0.711 0.022 32.654 0 0.668
## 17 x8 ~~ x8 0.707 0.022 32.434 0 0.665
## 18 x9 ~~ x9 0.522 0.022 23.635 0 0.479
## 19 conservative ~~ conservative 1.000 0.000 NA NA 1.000
## ci.upper
## 1 0.589
## 2 0.558
## 3 0.720
## 4 0.708
## 5 0.549
## 6 0.595
## 7 0.577
## 8 0.580
## 9 0.723
## 10 0.741
## 11 0.775
## 12 0.570
## 13 0.587
## 14 0.784
## 15 0.734
## 16 0.753
## 17 0.750
## 18 0.565
## 19 1.000
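The standardized loadings above can be recovered by hand from the unstandardized estimates. This sketch copies the loadings, residual variances, and factor variance from the first `summary(cfa.fit)` output and rescales each loading by the factor's standard deviation and the item's model-implied standard deviation. The variable names are illustrative.

```r
# Unstandardized estimates copied from the first summary(cfa.fit) output.
lambda <- c(x1 = 1.000, x2 = 0.738, x3 = 0.827, x4 = 0.756, x5 = 0.738,
            x6 = 0.915, x7 = 1.028, x8 = 0.549, x9 = 0.928)
theta  <- c(x1 = 0.729, x2 = 0.471, x3 = 0.240, x4 = 0.215, x5 = 0.495,
            x6 = 0.590, x7 = 0.820, x8 = 0.230, x9 = 0.297)
phi    <- 0.316  # variance of the latent variable

# Model-implied item variance: shared part plus unique part.
implied_var <- lambda^2 * phi + theta

# Standardized loading: loading * SD(factor) / SD(item).
std_loading <- lambda * sqrt(phi) / sqrt(implied_var)
round(std_loading, 3)

# Standardized residual variance is the unique share of each item's variance.
std_resid <- theta / implied_var
round(std_resid, 3)
```

Up to rounding of the inputs, these should agree with the est.std column of `standardizedSolution(cfa.fit)`.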
library(semPlot)
semPaths(cfa.fit, what = "est", edge.label.cex = 0.75, edge.color = "black")
Interpretation of the loadings: if you are 1 standard deviation higher on conservative, you will respond 0.55 standard deviations higher on x1, 0.52 standard deviations higher on x2, etc.
With CFA we can also estimate reliability, which is the proportion of the total variation in a scale formed by our indicators that is attributed to the true score.
1.7 Assessing goodness of fit
summary(cfa.fit, fit.measures = TRUE)
## lavaan 0.6-9 ended normally after 24 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 18
##
## Used Total
## Number of observations 1625 8985
##
## Model Test User Model:
##
## Test statistic 419.007
## Degrees of freedom 27
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 3872.316
## Degrees of freedom 36
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.898
## Tucker-Lewis Index (TLI) 0.864
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -15593.729
## Loglikelihood unrestricted model (H1) NA
##
## Akaike (AIC) 31223.457
## Bayesian (BIC) 31320.536
## Sample-size adjusted Bayesian (BIC) 31263.353
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.095
## 90 Percent confidence interval - lower 0.087
## 90 Percent confidence interval - upper 0.103
## P-value RMSEA <= 0.05 0.000
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.049
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## conservative =~
## x1 1.000
## x2 (a2) 0.738 0.046 16.153 0.000
## x3 (a3) 0.827 0.043 19.425 0.000
## x4 (a4) 0.756 0.039 19.222 0.000
## x5 (a5) 0.738 0.046 15.941 0.000
## x6 (a6) 0.915 0.054 16.992 0.000
## x7 (a7) 1.028 0.062 16.607 0.000
## x8 (a8) 0.549 0.033 16.675 0.000
## x9 (a9) 0.928 0.048 19.475 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .x1 0.729 0.028 26.051 0.000
## .x2 0.471 0.018 26.439 0.000
## .x3 0.240 0.010 23.378 0.000
## .x4 0.215 0.009 23.722 0.000
## .x5 0.495 0.019 26.540 0.000
## .x6 0.590 0.023 25.970 0.000
## .x7 0.820 0.031 26.201 0.000
## .x8 0.230 0.009 26.162 0.000
## .x9 0.297 0.013 23.287 0.000
## conservative 0.316 0.029 11.039 0.000
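The fit indices in the summary above can be reproduced from the two chi-square tests. The helper `fit_indices()` is hypothetical and uses the standard textbook formulas. It divides by N in the RMSEA, which is an assumption about lavaan's internals, although using N - 1 gives the same third decimal at this sample size.

```r
# Textbook formulas for RMSEA, CFI, and TLI from model and baseline chi-squares.
fit_indices <- function(chisq_m, df_m, chisq_b, df_b, n) {
  excess_m <- max(chisq_m - df_m, 0)  # model misfit beyond its degrees of freedom
  excess_b <- max(chisq_b - df_b, 0)  # baseline misfit beyond its degrees of freedom
  rmsea <- sqrt(excess_m / (df_m * n))
  cfi   <- 1 - excess_m / max(excess_b, excess_m)
  tli   <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)
  c(rmsea = rmsea, cfi = cfi, tli = tli)
}

# Values copied from the summary output: user model, baseline model, and N used.
nine_item <- fit_indices(chisq_m = 419.007, df_m = 27,
                         chisq_b = 3872.316, df_b = 36, n = 1625)
round(nine_item, 3)
```

By common rules of thumb (not stated in the source), a CFI below 0.95 and an RMSEA above 0.08 point to a model that needs revision, which motivates the modification indices that follow.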
modificationIndices(cfa.fit)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 20 x1 ~~ x2 27.046 0.083 0.083 0.142 0.142
## 21 x1 ~~ x3 1.645 -0.016 -0.016 -0.038 -0.038
## 22 x1 ~~ x4 22.437 -0.055 -0.055 -0.139 -0.139
## 23 x1 ~~ x5 1.255 0.018 0.018 0.030 0.030
## 24 x1 ~~ x6 16.165 0.073 0.073 0.111 0.111
## 25 x1 ~~ x7 8.434 0.062 0.062 0.080 0.080
## 26 x1 ~~ x8 13.662 -0.042 -0.042 -0.101 -0.101
## 27 x1 ~~ x9 0.798 -0.012 -0.012 -0.027 -0.027
## 28 x2 ~~ x3 2.945 0.017 0.017 0.050 0.050
## 29 x2 ~~ x4 0.252 0.005 0.005 0.015 0.015
## 30 x2 ~~ x5 16.394 0.053 0.053 0.109 0.109
## 31 x2 ~~ x6 39.084 -0.090 -0.090 -0.171 -0.171
## 32 x2 ~~ x7 0.083 0.005 0.005 0.008 0.008
## 33 x2 ~~ x8 1.676 -0.012 -0.012 -0.035 -0.035
## 34 x2 ~~ x9 11.090 -0.036 -0.036 -0.098 -0.098
## 35 x3 ~~ x4 147.975 0.089 0.089 0.392 0.392
## 36 x3 ~~ x5 18.664 -0.043 -0.043 -0.126 -0.126
## 37 x3 ~~ x6 15.379 -0.044 -0.044 -0.116 -0.116
## 38 x3 ~~ x7 15.291 -0.051 -0.051 -0.115 -0.115
## 39 x3 ~~ x8 1.463 0.008 0.008 0.036 0.036
## 40 x3 ~~ x9 16.558 -0.036 -0.036 -0.133 -0.133
## 41 x4 ~~ x5 0.441 0.006 0.006 0.019 0.019
## 42 x4 ~~ x6 12.898 -0.038 -0.038 -0.106 -0.106
## 43 x4 ~~ x7 22.944 -0.059 -0.059 -0.140 -0.140
## 44 x4 ~~ x8 11.218 0.022 0.022 0.098 0.098
## 45 x4 ~~ x9 30.371 -0.045 -0.045 -0.178 -0.178
## 46 x5 ~~ x6 1.967 0.021 0.021 0.038 0.038
## 47 x5 ~~ x7 3.838 -0.034 -0.034 -0.053 -0.053
## 48 x5 ~~ x8 0.168 0.004 0.004 0.011 0.011
## 49 x5 ~~ x9 0.013 0.001 0.001 0.003 0.003
## 50 x6 ~~ x7 29.683 0.104 0.104 0.150 0.150
## 51 x6 ~~ x8 31.435 -0.057 -0.057 -0.154 -0.154
## 52 x6 ~~ x9 62.385 0.098 0.098 0.235 0.235
## 53 x7 ~~ x8 1.752 -0.016 -0.016 -0.036 -0.036
## 54 x7 ~~ x9 19.055 0.064 0.064 0.129 0.129
## 55 x8 ~~ x9 16.553 0.031 0.031 0.120 0.120
A modification index is an estimate of how much the chi-squared would be reduced if we estimated a particular extra parameter.
Because items 2 and 8 aren’t great indicators of conservatism, we will drop them. We’ll also allow the errors of items 3 and 4 to be correlated, since x3 ~~ x4 has by far the largest modification index (147.975).
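One way to scan a long table of modification indices is to treat each index as an approximate 1-df chi-square and keep only those above the 5% critical value. That threshold is a common convention, not something the source states. The sketch below copies a handful of rows from the output above into a small data frame and sorts the flagged ones.

```r
# A few rows copied from the modificationIndices(cfa.fit) output above.
mi <- data.frame(
  lhs = c("x1", "x2", "x3", "x4", "x5", "x6"),
  rhs = c("x2", "x6", "x4", "x9", "x9", "x9"),
  mi  = c(27.046, 39.084, 147.975, 30.371, 0.013, 62.385)
)

# 5% critical value of a chi-square with 1 degree of freedom (about 3.84).
crit <- qchisq(0.95, df = 1)

# Keep the indices above the threshold, largest first.
flagged <- mi[mi$mi > crit, ]
flagged <- flagged[order(-flagged$mi), ]
flagged
```

On the fitted object itself, `modificationIndices()` in lavaan accepts `sort.` and `minimum.value` arguments that should give the same kind of filtered listing. A large index is only a prompt: the extra parameter still needs a substantive justification.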
cfa.model <- '
  # latent variable model
  conservative =~ a1 * x1 + a3 * x3 + a4 * x4 + a5 * x5 + a6 * x6 + a7 * x7 +
                  a9 * x9
  # covariances
  x3 ~~ x4
'
cfa.fit <- cfa(cfa.model, data = nlsy.data)
summary(cfa.fit, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)
## lavaan 0.6-9 ended normally after 28 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Used Total
## Number of observations 1631 8985
##
## Model Test User Model:
##
## Test statistic 56.021
## Degrees of freedom 13
## P-value (Chi-square) 0.000
##
## Model Test Baseline Model:
##
## Test statistic 2870.789
## Degrees of freedom 21
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 0.985
## Tucker-Lewis Index (TLI) 0.976
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -12634.282
## Loglikelihood unrestricted model (H1) NA
##
## Akaike (AIC) 25298.564
## Bayesian (BIC) 25379.519
## Sample-size adjusted Bayesian (BIC) 25331.866
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.045
## 90 Percent confidence interval - lower 0.033
## 90 Percent confidence interval - upper 0.057
## P-value RMSEA <= 0.05 0.729
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.023
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## conservative =~
## x1 (a1) 1.000 0.573 0.560
## x3 (a3) 0.685 0.040 16.946 0.000 0.393 0.580
## x4 (a4) 0.618 0.037 16.592 0.000 0.354 0.563
## x5 (a5) 0.705 0.046 15.279 0.000 0.404 0.496
## x6 (a6) 1.037 0.057 18.130 0.000 0.594 0.642
## x7 (a7) 1.078 0.064 16.944 0.000 0.618 0.575
## x9 (a9) 0.955 0.049 19.311 0.000 0.547 0.727
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .x3 ~~
## .x4 0.110 0.009 12.093 0.000 0.110 0.384
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .x1 0.717 0.029 25.054 0.000 0.717 0.686
## .x3 0.304 0.012 24.449 0.000 0.304 0.663
## .x4 0.270 0.011 24.762 0.000 0.270 0.683
## .x5 0.502 0.019 26.080 0.000 0.502 0.754
## .x6 0.504 0.022 23.153 0.000 0.504 0.588
## .x7 0.772 0.031 24.769 0.000 0.772 0.669
## .x9 0.268 0.013 19.898 0.000 0.268 0.472
## conservative 0.329 0.030 10.957 0.000 1.000 1.000
##
## R-Square:
## Estimate
## x1 0.314
## x3 0.337
## x4 0.317
## x5 0.246
## x6 0.412
## x7 0.331
## x9 0.528
standardizedSolution(cfa.fit)
## lhs op rhs label est.std se z pvalue ci.lower
## 1 conservative =~ x1 a1 0.560 0.021 27.163 0 0.520
## 2 conservative =~ x3 a3 0.580 0.020 28.457 0 0.540
## 3 conservative =~ x4 a4 0.563 0.021 26.987 0 0.522
## 4 conservative =~ x5 a5 0.496 0.022 22.383 0 0.452
## 5 conservative =~ x6 a6 0.642 0.019 34.481 0 0.605
## 6 conservative =~ x7 a7 0.575 0.020 28.375 0 0.536
## 7 conservative =~ x9 a9 0.727 0.017 43.978 0 0.694
## 8 x3 ~~ x4 0.384 0.024 16.223 0 0.338
## 9 x1 ~~ x1 0.686 0.023 29.653 0 0.641
## 10 x3 ~~ x3 0.663 0.024 28.035 0 0.617
## 11 x4 ~~ x4 0.683 0.023 29.089 0 0.637
## 12 x5 ~~ x5 0.754 0.022 34.374 0 0.711
## 13 x6 ~~ x6 0.588 0.024 24.624 0 0.541
## 14 x7 ~~ x7 0.669 0.023 28.684 0 0.623
## 15 x9 ~~ x9 0.472 0.024 19.629 0 0.425
## 16 conservative ~~ conservative 1.000 0.000 NA NA 1.000
## ci.upper
## 1 0.601
## 2 0.620
## 3 0.604
## 4 0.539
## 5 0.678
## 6 0.615
## 7 0.759
## 8 0.431
## 9 0.731
## 10 0.710
## 11 0.729
## 12 0.797
## 13 0.635
## 14 0.715
## 15 0.519
## 16 1.000
1.7.1 Final model and estimating scale reliability
- Fix the variance of conservative to 1 so that all seven loadings are freely estimated.
cfa.model <- '
  # latent variable model
  conservative =~ x1 + x3 + x4 + x5 + x6 + x7 + x9
  # (co)variances
  x3 ~~ x4
  conservative ~~ 1 * conservative
'
cfa.fit <- cfa(cfa.model, data = nlsy.data, std.lv = TRUE)
summary(cfa.fit)
## lavaan 0.6-9 ended normally after 19 iterations
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 15
##
## Used Total
## Number of observations 1631 8985
##
## Model Test User Model:
##
## Test statistic 56.021
## Degrees of freedom 13
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## conservative =~
## x1 0.573 0.026 21.914 0.000
## x3 0.393 0.017 22.656 0.000
## x4 0.354 0.016 21.830 0.000
## x5 0.404 0.021 19.017 0.000
## x6 0.594 0.023 25.749 0.000
## x7 0.618 0.027 22.595 0.000
## x9 0.547 0.018 29.977 0.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## .x3 ~~
## .x4 0.110 0.009 12.093 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## conservative 1.000
## .x1 0.717 0.029 25.054 0.000
## .x3 0.304 0.012 24.449 0.000
## .x4 0.270 0.011 24.762 0.000
## .x5 0.502 0.019 26.080 0.000
## .x6 0.504 0.022 23.153 0.000
## .x7 0.772 0.031 24.769 0.000
## .x9 0.268 0.013 19.898 0.000
standardizedSolution(cfa.fit)
## lhs op rhs est.std se z pvalue ci.lower ci.upper
## 1 conservative =~ x1 0.560 0.021 27.163 0 0.520 0.601
## 2 conservative =~ x3 0.580 0.020 28.457 0 0.540 0.620
## 3 conservative =~ x4 0.563 0.021 26.987 0 0.522 0.604
## 4 conservative =~ x5 0.496 0.022 22.383 0 0.452 0.539
## 5 conservative =~ x6 0.642 0.019 34.481 0 0.605 0.678
## 6 conservative =~ x7 0.575 0.020 28.375 0 0.536 0.615
## 7 conservative =~ x9 0.727 0.017 43.978 0 0.694 0.759
## 8 x3 ~~ x4 0.384 0.024 16.223 0 0.338 0.431
## 9 conservative ~~ conservative 1.000 0.000 NA NA 1.000 1.000
## 10 x1 ~~ x1 0.686 0.023 29.653 0 0.641 0.731
## 11 x3 ~~ x3 0.663 0.024 28.035 0 0.617 0.710
## 12 x4 ~~ x4 0.683 0.023 29.089 0 0.637 0.729
## 13 x5 ~~ x5 0.754 0.022 34.374 0 0.711 0.797
## 14 x6 ~~ x6 0.588 0.024 24.624 0 0.541 0.635
## 15 x7 ~~ x7 0.669 0.023 28.684 0 0.623 0.715
## 16 x9 ~~ x9 0.472 0.024 19.629 0 0.425 0.519
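The source does not show how the scale reliability is computed. As a hedged sketch, the code below applies one common composite-reliability formula for a single factor with variance fixed at 1, with the correlated error added to the error variance. The choice of formula is an assumption. The inputs are copied from the `summary(cfa.fit)` output of the final model.

```r
# Loadings and residual variances copied from the final-model summary (factor variance = 1).
lambda <- c(x1 = 0.573, x3 = 0.393, x4 = 0.354, x5 = 0.404,
            x6 = 0.594, x7 = 0.618, x9 = 0.547)
theta  <- c(x1 = 0.717, x3 = 0.304, x4 = 0.270, x5 = 0.502,
            x6 = 0.504, x7 = 0.772, x9 = 0.268)
cov_x3_x4 <- 0.110  # estimated error covariance between x3 and x4

# Variance of the summed scale that is due to the latent variable.
true_var <- sum(lambda)^2

# Error variance of the summed scale, counting the error covariance twice.
error_var <- sum(theta) + 2 * cov_x3_x4

# Reliability: true-score variance over total variance.
reliability <- true_var / (true_var + error_var)
round(reliability, 2)
```

Under these assumptions the estimate comes out near 0.77, somewhat below the alpha of 0.81 reported for the nine-item scale, though the two figures refer to different item sets.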
- now plot the final model
library(lavaanPlot)
lavaanPlot(model = cfa.fit, coefs = TRUE, stand = TRUE)
semPaths(cfa.fit)
1.8 A two-factor model
- Now we add a second latent variable, Depress, which is measured by three indicators.
1.8.1 Evaluating the depression dimension
dep.model <- '
  Depress =~ x11 + x12 + x13
'
dep.fit <- cfa(dep.model, data = nlsy.data)
standardizedSolution(dep.fit)
## lhs op rhs est.std se z pvalue ci.lower ci.upper
## 1 Depress =~ x11 0.813 0.010 79.821 0 0.793 0.833
## 2 Depress =~ x12 -0.609 0.010 -59.602 0 -0.629 -0.589
## 3 Depress =~ x13 0.655 0.010 64.725 0 0.635 0.675
## 4 x11 ~~ x11 0.339 0.017 20.458 0 0.306 0.371
## 5 x12 ~~ x12 0.629 0.012 50.592 0 0.605 0.654
## 6 x13 ~~ x13 0.571 0.013 43.112 0 0.545 0.597
## 7 Depress ~~ Depress 1.000 0.000 NA NA 1.000 1.000
Item x12 would need to be reverse coded if we were using a traditional summated scale.
It is useful to have the majority of the indicators coded so that a higher score is associated with more of the concept.
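Reverse coding a bounded item amounts to subtracting it from the sum of its minimum and maximum. The helper `reverse_code()` below is a hypothetical sketch. Its defaults of 1 and 4 follow the ranges shown in the `summary(nlsy.data)` output.

```r
# Reverse code an item measured on a min_value..max_value scale.
reverse_code <- function(x, min_value = 1, max_value = 4) {
  (max_value + min_value) - x
}

# Made-up responses; missing values stay missing.
x12 <- c(1, 2, 3, 4, NA)
x12_reversed <- reverse_code(x12)
x12_reversed
```

After this transformation, x12 should load on Depress with the same sign as x11 and x13.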
1.8.2 Estimating a two-factor model
two.fac.model <- '
  # latent variable models
  Conservative =~ x1 + x3 + x4 + x5 + x6 + x7 + x9
  Depress =~ x11 + x12 + x13
  # (co)variances
  x3 ~~ x4
'
two.fac.fit <- cfa(two.fac.model, data = nlsy.data)
standardizedSolution(two.fac.fit)
## lhs op rhs est.std se z pvalue ci.lower ci.upper
## 1 Conservative =~ x1 0.555 0.022 25.456 0.000 0.512 0.598
## 2 Conservative =~ x3 0.578 0.021 26.991 0.000 0.536 0.620
## 3 Conservative =~ x4 0.565 0.022 25.914 0.000 0.522 0.608
## 4 Conservative =~ x5 0.482 0.024 20.472 0.000 0.436 0.529
## 5 Conservative =~ x6 0.663 0.019 35.068 0.000 0.626 0.700
## 6 Conservative =~ x7 0.581 0.021 27.491 0.000 0.539 0.622
## 7 Conservative =~ x9 0.730 0.017 42.531 0.000 0.697 0.764
## 8 Depress =~ x11 0.811 0.022 36.108 0.000 0.767 0.855
## 9 Depress =~ x12 -0.615 0.023 -27.212 0.000 -0.659 -0.570
## 10 Depress =~ x13 0.647 0.022 28.880 0.000 0.603 0.691
## 11 x3 ~~ x4 0.380 0.025 15.176 0.000 0.331 0.429
## 12 x1 ~~ x1 0.692 0.024 28.613 0.000 0.645 0.740
## 13 x3 ~~ x3 0.665 0.025 26.836 0.000 0.617 0.714
## 14 x4 ~~ x4 0.681 0.025 27.600 0.000 0.632 0.729
## 15 x5 ~~ x5 0.767 0.023 33.756 0.000 0.723 0.812
## 16 x6 ~~ x6 0.560 0.025 22.317 0.000 0.511 0.609
## 17 x7 ~~ x7 0.663 0.025 27.007 0.000 0.615 0.711
## 18 x9 ~~ x9 0.467 0.025 18.605 0.000 0.417 0.516
## 19 x11 ~~ x11 0.342 0.036 9.389 0.000 0.271 0.414
## 20 x12 ~~ x12 0.622 0.028 22.420 0.000 0.568 0.677
## 21 x13 ~~ x13 0.581 0.029 20.028 0.000 0.524 0.638
## 22 Conservative ~~ Conservative 1.000 0.000 NA NA 1.000 1.000
## 23 Depress ~~ Depress 1.000 0.000 NA NA 1.000 1.000
## 24 Conservative ~~ Depress 0.115 0.033 3.469 0.001 0.050 0.180
broom::tidy(cfa.fit, conf.int = TRUE)
## # A tibble: 16 × 11
## term op estimate std.error statistic p.value conf.low conf.high std.lv
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 conserv… =~ 0.573 0.0262 21.9 0 0.522 0.624 0.573
## 2 conserv… =~ 0.393 0.0173 22.7 0 0.359 0.427 0.393
## 3 conserv… =~ 0.354 0.0162 21.8 0 0.322 0.386 0.354
## 4 conserv… =~ 0.404 0.0213 19.0 0 0.363 0.446 0.404
## 5 conserv… =~ 0.594 0.0231 25.7 0 0.549 0.640 0.594
## 6 conserv… =~ 0.618 0.0273 22.6 0 0.564 0.671 0.618
## 7 conserv… =~ 0.547 0.0183 30.0 0 0.512 0.583 0.547
## 8 x3 ~~ x4 ~~ 0.110 0.00912 12.1 0 0.0924 0.128 0.110
## 9 conserv… ~~ 1 0 NA NA 1 1 1
## 10 x1 ~~ x1 ~~ 0.717 0.0286 25.1 0 0.661 0.773 0.717
## 11 x3 ~~ x3 ~~ 0.304 0.0124 24.4 0 0.280 0.329 0.304
## 12 x4 ~~ x4 ~~ 0.270 0.0109 24.8 0 0.249 0.292 0.270
## 13 x5 ~~ x5 ~~ 0.502 0.0192 26.1 0 0.464 0.540 0.502
## 14 x6 ~~ x6 ~~ 0.504 0.0218 23.2 0 0.462 0.547 0.504
## 15 x7 ~~ x7 ~~ 0.772 0.0312 24.8 0 0.711 0.833 0.772
## 16 x9 ~~ x9 ~~ 0.268 0.0134 19.9 0 0.241 0.294 0.268
## # … with 2 more variables: std.all <dbl>, std.nox <dbl>
semPaths(two.fac.fit)
modificationIndices(two.fac.fit)
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 25 Conservative =~ x11 2.254 -0.048 -0.027 -0.041 -0.041
## 26 Conservative =~ x12 6.017 0.073 0.041 0.064 0.064
## 27 Conservative =~ x13 16.582 0.107 0.061 0.105 0.105
## 28 Depress =~ x1 0.011 -0.005 -0.003 -0.003 -0.003
## 29 Depress =~ x3 2.077 0.043 0.023 0.034 0.034
## 30 Depress =~ x4 0.000 0.000 0.000 0.000 0.000
## 31 Depress =~ x5 3.603 -0.079 -0.043 -0.052 -0.052
## 32 Depress =~ x6 5.472 -0.101 -0.055 -0.059 -0.059
## 33 Depress =~ x7 4.911 0.117 0.063 0.058 0.058
## 34 Depress =~ x9 0.185 0.015 0.008 0.011 0.011
## 35 x1 ~~ x3 5.651 0.031 0.031 0.065 0.065
## 36 x1 ~~ x4 1.121 -0.013 -0.013 -0.029 -0.029
## 37 x1 ~~ x5 2.792 0.030 0.030 0.049 0.049
## 38 x1 ~~ x6 0.975 0.019 0.019 0.033 0.033
## 39 x1 ~~ x7 2.114 0.034 0.034 0.045 0.045
## 40 x1 ~~ x9 18.483 -0.069 -0.069 -0.157 -0.157
## 41 x1 ~~ x11 1.026 0.012 0.012 0.038 0.038
## 42 x1 ~~ x12 0.018 0.002 0.002 0.004 0.004
## 43 x1 ~~ x13 1.775 -0.015 -0.015 -0.040 -0.040
## 44 x3 ~~ x5 1.963 -0.015 -0.015 -0.037 -0.037
## 45 x3 ~~ x6 4.246 -0.023 -0.023 -0.061 -0.061
## 46 x3 ~~ x7 0.263 -0.007 -0.007 -0.014 -0.014
## 47 x3 ~~ x9 1.484 0.011 0.011 0.039 0.039
## 48 x3 ~~ x11 6.148 0.018 0.018 0.083 0.083
## 49 x3 ~~ x12 2.091 0.011 0.011 0.039 0.039
## 50 x3 ~~ x13 0.099 -0.002 -0.002 -0.009 -0.009
## 51 x4 ~~ x5 26.360 0.050 0.050 0.135 0.135
## 52 x4 ~~ x6 1.634 -0.013 -0.013 -0.037 -0.037
## 53 x4 ~~ x7 1.751 -0.017 -0.017 -0.036 -0.036
## 54 x4 ~~ x9 0.059 -0.002 -0.002 -0.008 -0.008
## 55 x4 ~~ x11 0.717 -0.006 -0.006 -0.028 -0.028
## 56 x4 ~~ x12 1.891 0.010 0.010 0.037 0.037
## 57 x4 ~~ x13 5.704 0.015 0.015 0.065 0.065
## 58 x5 ~~ x6 1.649 -0.020 -0.020 -0.041 -0.041
## 59 x5 ~~ x7 7.188 -0.050 -0.050 -0.080 -0.080
## 60 x5 ~~ x9 0.772 -0.011 -0.011 -0.030 -0.030
## 61 x5 ~~ x11 2.954 -0.017 -0.017 -0.063 -0.063
## 62 x5 ~~ x12 1.324 0.012 0.012 0.034 0.034
## 63 x5 ~~ x13 1.475 0.011 0.011 0.036 0.036
## 64 x6 ~~ x7 1.503 0.025 0.025 0.041 0.041
## 65 x6 ~~ x9 6.477 0.038 0.038 0.108 0.108
## 66 x6 ~~ x11 4.092 -0.021 -0.021 -0.078 -0.078
## 67 x6 ~~ x12 1.071 0.011 0.011 0.032 0.032
## 68 x6 ~~ x13 1.097 0.010 0.010 0.033 0.033
## 69 x7 ~~ x9 0.720 0.014 0.014 0.032 0.032
## 70 x7 ~~ x11 1.050 0.013 0.013 0.038 0.038
## 71 x7 ~~ x12 0.197 0.006 0.006 0.013 0.013
## 72 x7 ~~ x13 2.356 0.018 0.018 0.047 0.047
## 73 x9 ~~ x11 1.600 -0.010 -0.010 -0.052 -0.052
## 74 x9 ~~ x12 3.648 -0.016 -0.016 -0.062 -0.062
## 75 x9 ~~ x13 0.124 0.003 0.003 0.012 0.012
## 76 x11 ~~ x12 16.582 -0.299 -0.299 -1.497 -1.497
## 77 x11 ~~ x13 6.017 -0.181 -0.181 -1.046 -1.046
## 78 x12 ~~ x13 2.254 0.065 0.065 0.287 0.287