Exercise 7 Fitting a single-factor model to polytomous questionnaire data

Data file SDQ.RData
R package psych

7.1 Objectives

The objective of this exercise is to fit a single-factor model to item-level questionnaire data, thereby testing the homogeneity of an item set.

7.2 Worked Example - Testing homogeneity of SDQ Conduct Problems scale

To complete this exercise, you need to work through the worked example below, which factor-analyses the SDQ Conduct Problems scale. Once you feel confident, you can repeat the same steps for the other SDQ scales.

This exercise continues to make use of the data we considered in Exercise 1 and then again in Exercise 5. These data come from a community sample of pupils from the same school (N=228). They completed the Strengths and Difficulties Questionnaire (SDQ) twice - the first time at the start of secondary school (Year 7), and then one year later (Year 8).

Step 1. Creating or Opening project

If you have already worked with this data set in Exercise 1 or Exercise 5, the simplest thing to do is to continue working within the project created back then. In RStudio, select File / Open Project and navigate to the folder and the project you created. You should see the data frame SDQ appearing in your Environment tab, together with other objects you created and saved.

If you have not completed the previous Exercises or have not saved your work, or simply want to start from scratch, download the data file SDQ.RData into a new folder and follow instructions from Exercise 1 on how to set up a new project and to load the data.

load("SDQ.RData")

Whether you are working in a new project or the old project, create a new R script (File / New File / R Script) for this exercise to keep a separate record of the commands needed for the test homogeneity analyses.

Step 2. Examining data

Now either click on the SDQ object in the Environment tab or run View(SDQ) to remind yourself of the data set. It contains 228 rows (cases, observations) and 51 variables. Run names(SDQ) to get the variable names. There is a Gender variable (0=male; 1=female), followed by responses to the 25 SDQ items at Time 1, named consid, restles, somatic etc. These are followed by 25 more variables named consid2, restles2, somatic2 etc. These are responses to the same SDQ items at Time 2.

Run head(SDQ) from your script to get a quick preview of the first 6 rows of data. You should see that there are some missing responses, marked ‘NA’. There are more missing responses at Time 2, with whole rows missing for some pupils.
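If you would like to quantify the missingness, here is a quick optional check using base R only (not part of the original worked example):

# Count missing responses per variable
colSums(is.na(SDQ))
# Count cases (rows) with at least one missing response
sum(!complete.cases(SDQ))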

Step 3. Creating variable list

If you have not done so in previous exercises, begin by creating a list of the items that measure Conduct Problems (you can see them in the table given in Exercise 1). This will enable easy reference to the data from these 5 variables in all analyses. We will use c() - the base R function for combining values into a vector.

items_conduct <- c("tantrum","obeys","fights","lies","steals")

Note how a new object items_conduct appeared in the Environment tab. Try calling SDQ[items_conduct] to pull only the data from these 5 items.
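For example, to preview just the first few rows of these five items:

# Preview responses to the five Conduct Problems items only
head(SDQ[items_conduct])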

Step 4. Examining suitability of data for factor analysis

Now, load the package psych to enable access to its functionality:

library(psych)

Before starting factor analysis, check correlations between responses to the 5 Conduct Problems items. Package psych has function corr.test(), which prints the full correlation matrix, pairwise sample sizes and corresponding p-values for the significance of correlations. If you want just the correlation matrix in a compact format, use function lowerCor() from this package instead.

lowerCor(SDQ[items_conduct])
##         tntrm obeys fghts lies  stels
## tantrum  1.00                        
## obeys   -0.46  1.00                  
## fights   0.42 -0.50  1.00            
## lies     0.44 -0.30  0.25  1.00      
## steals   0.30 -0.26  0.35  0.23  1.00
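If you also want the pairwise sample sizes and p-values mentioned above, you can run corr.test() on the same items (output not shown here):

# Full correlation output with pairwise n and p-values
corr.test(SDQ[items_conduct])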

QUESTION 1. What can you say about the size and direction of inter-item correlations? Do you think these data are suitable for factor analysis?

To obtain the measure of sampling adequacy (MSA) - an index summarizing the correlations on their overall potential to measure something in common - request the Kaiser-Meyer-Olkin (KMO) index:

KMO(SDQ[items_conduct])
## Kaiser-Meyer-Olkin factor adequacy
## Call: KMO(r = SDQ[items_conduct])
## Overall MSA =  0.76
## MSA for each item = 
## tantrum   obeys  fights    lies  steals 
##    0.75    0.75    0.74    0.76    0.82

Kaiser (1975) proposed the following guidelines for interpreting the MSA and deciding on utility of factor analysis:

MSA range    Interpretation
0.9 to 1.0   Marvelous
0.8 to 0.9   Meritorious
0.7 to 0.8   Middling
0.6 to 0.7   Mediocre
0.5 to 0.6   Miserable
0.0 to 0.5   Totally useless

QUESTION 2. Interpret the resulting measure of sampling adequacy (MSA).

Step 5. Determining the number of factors

Package psych has a function that produces Scree plots for the observed data and for a simulated random (i.e. uncorrelated) data matrix of the same size. Comparison of the two scree plots is called Parallel Analysis. We retain factors from the blue scree plot (real data) that lie ABOVE the red plot (simulated random data), in which we expect no common variance, only random variance. In Scree plots, important factors that account for common variance are likened to points on the hard rock of a mountain, while trivial factors that account only for random variance are likened to the rubble at the bottom of a mountain ("scree" in geology). Examining the Scree plot of the empirical data is helpful for a preliminary decision on which factors belong to the hard rock and which to the rubble, but Parallel Analysis adds a statistical comparison with the baseline expected for data of this size.

Function fa.parallel(x,…, fa="both",…) has only one required argument – the actual data (x). There are many other arguments, but they all have default values and therefore are not required if we are happy with the defaults. We will change only one default - the type of eigenvalues shown. We will keep it simple and display only eigenvalues for principal components (fa="pc"), as done in some commercial software such as Mplus.

fa.parallel(SDQ[items_conduct], fa="pc")

## Parallel analysis suggests that the number of factors =  NA  and the number of components =  1
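If you are curious, you can also run the function with its default fa="both", which adds the eigenvalues of the common-factor solution to the plot alongside the principal components (output not shown here):

# Parallel analysis showing both principal component and factor eigenvalues
fa.parallel(SDQ[items_conduct], fa="both")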

QUESTION 3. Examine the Scree plot. Does it support the hypothesis that there is only one factor underlying responses to Conduct Problems items? Why? Does your conclusion agree with the text output of the parallel analysis function?

Step 6. Fitting and interpreting a single-factor model

Now let’s run the factor analysis, extracting one factor. I recommend calling the documentation for function fa() from package psych by running the command ?fa. From the general form of this function, fa(r, nfactors=1,…), it is clear that we need to supply the data (argument r, which can be a correlation or covariance matrix, or a raw data matrix like we have). The other arguments all have defaults. The default estimation method is fm="minres". This method minimises the residuals (the differences between the observed and the reproduced correlations), and it is probably a good choice for these data (the sample size is not that large, and responses in only 3 categories cannot be normally distributed). The default number of factors to extract (nfactors=1) is exactly what we want; however, I recommend writing it explicitly for your own future reference:

fa(SDQ[items_conduct], nfactors=1)
## Factor Analysis using method =  minres
## Call: fa(r = SDQ[items_conduct], nfactors = 1)
## Standardized loadings (pattern matrix) based upon correlation matrix
##           MR1   h2   u2 com
## tantrum  0.71 0.51 0.49   1
## obeys   -0.67 0.44 0.56   1
## fights   0.65 0.42 0.58   1
## lies     0.49 0.24 0.76   1
## steals   0.45 0.20 0.80   1
## 
##                 MR1
## SS loadings    1.82
## Proportion Var 0.36
## 
## Mean item complexity =  1
## Test of the hypothesis that 1 factor is sufficient.
## 
## df null model =  10  with the objective function =  1 with Chi Square =  223.41
## df of  the model are 5  and the objective function was  0.07 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic n.obs is  226 with the empirical chi square  11.73  with prob <  0.039 
## The total n.obs was  228  with Likelihood Chi Square =  14.78  with prob <  0.011 
## 
## Tucker Lewis Index of factoring reliability =  0.908
## RMSEA index =  0.093  and the 90 % confidence intervals are  0.04 0.149
## BIC =  -12.36
## Fit based upon off diagonal values = 0.98
## Measures of factor score adequacy             
##                                                    MR1
## Correlation of (regression) scores with factors   0.87
## Multiple R square of scores with factors          0.76
## Minimum correlation of possible factor scores     0.52

Run this command and examine the output. Try to answer the following questions (refer to instructional materials of your choice for help; I recommend McDonald’s “Test Theory”). I will indicate which parts of the output you need to answer each question.

QUESTION 4. Examine the Standardized factor loadings. How do you interpret them? What is the “marker item” for the Conduct Problems scale? [In the “Standardized loadings” output, the loadings are printed in “MR1” column. “MR” stands for the estimation method, “Minimum Residuals”, and “1” stands for the factor number. Here we have only 1 factor, so only one column.]

QUESTION 5. Examine communalities and uniquenesses (look at h2 and u2 values in the table of “Standardized loadings”, respectively). What is communality and uniqueness and how do you interpret them?

QUESTION 6. What is the proportion of variance explained by the factor in the five items (total variance explained)? To answer this question, look for “Proportion Var” entry in the output (in a small table beginning with “SS loadings”).
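If it helps to see the arithmetic behind Questions 5 and 6, the sketch below re-runs fa() and stores the result in a temporary object (fit1 is just an illustrative name; the same results are saved as F_conduct in Step 7):

# A sketch of the arithmetic: in a single-factor model, communality (h2) is the
# squared standardized loading, uniqueness (u2) is 1 - h2, and "Proportion Var"
# is the sum of the communalities divided by the number of items.
fit1 <- fa(SDQ[items_conduct], nfactors=1)
L <- fit1$loadings[,1]                  # standardized factor loadings
round(cbind(loading=L, h2=L^2, u2=1-L^2), 2)
round(sum(L^2)/length(L), 2)            # proportion of variance explained, about 0.36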

Step 7. Assessing the model’s goodness of fit

Now let us examine the model’s goodness of fit (GOF). This output starts with the line “Test of the hypothesis that 1 factor is sufficient”. This is the hypothesis that the data comply with a single-factor model (“the model”). We are hoping to retain this hypothesis, and are therefore hoping for a large p-value, certainly larger than 0.05. The output also reports on the “null model”, which is the model in which all items are uncorrelated. We are obviously hoping to reject this null model and obtain a very small p-value. Both models are tested with the chi-square test, each with its respective degrees of freedom. For our single-factor model there are 5 degrees of freedom, because the model estimates 10 parameters (5 factor loadings and 5 uniquenesses), and there are 5*6/2=15 variances and covariances (sample moments) available to estimate them. The degrees of freedom are therefore 15-10=5.
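The same arithmetic as a quick sketch in R (plain arithmetic, no new functions):

p <- 5                    # number of items
p*(p+1)/2                 # sample moments (variances and covariances): 15
2*p                       # estimated parameters (5 loadings + 5 uniquenesses): 10
p*(p+1)/2 - 2*p           # degrees of freedom: 5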

QUESTION 7. Is the chi-square for the tested model significant? Do you accept or reject the single-factor model? [Look for “Likelihood Chi Square” in the output.]

For now, I will ignore some other “fit indices” printed in the output. I will return to them in the Exercises dealing with structural equation models (beginning with Exercise 16).

Now let’s examine the model’s residuals. Residuals are the differences between the observed item correlations (which we computed earlier) and the correlations “reproduced” by the model - that is, the correlations of item scores predicted by the model. The smaller the residuals, the more closely the model reproduces the data.

In the model output printed to the Console, however, we have only the Root Mean Square Residual (RMSR), which is obtained by squaring all the residuals, averaging them, and taking the square root of that average. The RMSR is a summary measure of the size of the residuals, and in a way it is like a GOF “effect size” - it does not depend on sample size. You can see that RMSR = 0.05, which is a good (low) value, indicating that the average residual is sufficiently small.

To obtain more detailed output of the residuals, we need to get access to all of the results produced by the function fa(). Call the function again, but this time, assign its results to a new object F_conduct, which we can “interrogate” later.

F_conduct <- fa(SDQ[items_conduct], nfactors=1)

Package psych has a nice function that pulls the residuals from the saved factor analysis results (object F_conduct) and prints them in a user-friendly way:

residuals.psych(F_conduct)
##         tntrm obeys fghts lies  stels
## tantrum  0.49                        
## obeys    0.01  0.56                  
## fights  -0.05 -0.06  0.58            
## lies     0.09  0.03 -0.07  0.76      
## steals  -0.02  0.04  0.06  0.00  0.80

NOTE: On the diagonal are discrepancies between the observed item variances (which equal 1 in this standardised solution) and the “reproduced” item variances (the variances explained by the common factor, i.e. the communalities). So, on the diagonal we have uniqueness = 1 - communality. You can check this by comparing the diagonal of the residual matrix with the u2 values we discussed earlier; they should be the same (a quick check is sketched after the next output). To remove these diagonal values from view, use:

residuals.psych(F_conduct, diag=FALSE)
##         tntrm obeys fghts lies  stels
## tantrum    NA                        
## obeys    0.01    NA                  
## fights  -0.05 -0.06    NA            
## lies     0.09  0.03 -0.07    NA      
## steals  -0.02  0.04  0.06  0.00    NA
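As a quick check of the points above, you can pull the residual matrix directly from the saved results, verify that its diagonal equals the uniquenesses, and recompute the RMSR from the off-diagonal residuals (a sketch using the F_conduct object; the value should be close to the printed RMSR of 0.05):

res <- F_conduct$residual               # residual matrix stored by fa()
round(diag(res), 2)                     # should match the u2 values printed earlier...
round(F_conduct$uniquenesses, 2)        # ...which are the uniquenesses
off <- res[lower.tri(res)]              # off-diagonal residual correlations
sqrt(mean(off^2))                       # root mean square residual, close to 0.05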

QUESTION 8. Examine the residual correlations (they are printed OFF the diagonal). What can you say about them? Are there any large residuals? (Hint. Interpret the size of residuals as you would the size of correlation coefficients.)

Step 8. Estimating scale reliability using McDonald’s omega

Since we have confirmed homogeneity of the Conduct Problems scale, we can legitimately estimate its reliability using coefficients alpha or omega. In Exercise 5, we already computed alpha for this scale, so you can refer to that instruction for detail. I will simply quote the result we obtained there - “raw alpha” of 0.72.

To obtain omega, call function omega(), specifying the number of factors (nfactors=1). You need this because various versions of coefficient omega exist for multi-factor models, but you only need the estimate for a homogeneous test, “Omega Total”.

# reverse code any counter-indicative items and put them in a data frame
# for details on this procedure, consult Exercise 1
R_conduct <- reverse.code(c(1,-1,1,1,1), SDQ[items_conduct])
# obtain McDonald's omega
omega(R_conduct, nfactors=1)
## Loading required namespace: GPArotation
## Omega_h for 1 factor is not meaningful, just omega_t
## Warning in schmid(m, nfactors, fm, digits, rotate = rotate, n.obs = n.obs, :
## Omega_h and Omega_asymptotic are not meaningful with one factor
## Omega 
## Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip, 
##     digits = digits, title = title, sl = sl, labels = labels, 
##     plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option, 
##     covar = covar)
## Alpha:                 0.73 
## G.6:                   0.7 
## Omega Hierarchical:    0.74 
## Omega H asymptotic:    1 
## Omega Total            0.74 
## 
## Schmid Leiman Factor loadings greater than  0.2 
##            g  F1*   h2   h2   u2 p2 com
## tantrum 0.71      0.51 0.51 0.49  1   1
## obeys-  0.67      0.44 0.44 0.56  1   1
## fights  0.65      0.42 0.42 0.58  1   1
## lies    0.49      0.24 0.24 0.76  1   1
## steals  0.45      0.20 0.20 0.80  1   1
## 
## With Sums of squares  of:
##    g  F1*   h2 
## 1.82 0.00 0.73 
## 
## general/max  2.48   max/min =   1.32232e+16
## mean percent general =  1    with sd =  0 and cv of  0 
## Explained Common Variance of the general factor =  1 
## 
## The degrees of freedom are 5  and the fit is  0.07 
## The number of observations was  228  with Chi Square =  14.78  with prob <  0.011
## The root mean square of the residuals is  0.05 
## The df corrected root mean square of the residuals is  0.07
## RMSEA index =  0.093  and the 10 % confidence intervals are  0.04 0.149
## BIC =  -12.36
## 
## Compare this with the adequacy of just a general factor and no group factors
## The degrees of freedom for just the general factor are 5  and the fit is  0.07 
## The number of observations was  228  with Chi Square =  14.78  with prob <  0.011
## The root mean square of the residuals is  0.05 
## The df corrected root mean square of the residuals is  0.07 
## 
## RMSEA index =  0.093  and the 10 % confidence intervals are  0.04 0.149
## BIC =  -12.36 
## 
## Measures of factor score adequacy             
##                                                  g F1*
## Correlation of scores with factors            0.87   0
## Multiple R square of scores with factors      0.76   0
## Minimum correlation of factor score estimates 0.52  -1
## 
##  Total, General and Subset omega for each subset
##                                                  g  F1*
## Omega total for total scores and subscales    0.74 0.74
## Omega general for total scores and subscales  0.74 0.74
## Omega group for total scores and subscales    0.00 0.00
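As a sanity check, for a single-factor model Omega Total can be computed by hand from the loadings and uniquenesses (a sketch using the F_conduct object from Step 7; taking loadings in absolute value has the same effect as the reverse-coding above):

lambda <- abs(F_conduct$loadings[,1])          # factor loadings in absolute value
theta  <- F_conduct$uniquenesses               # uniquenesses (1 - communality)
sum(lambda)^2 / (sum(lambda)^2 + sum(theta))   # Omega Total, close to 0.74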

QUESTION 9. What is the “Omega Total” for Conduct Problems scale score? How does it compare with the “raw alpha” for this scale? Which estimate do you think is more appropriate to assess the reliability of this scale?

Step 9. Saving your work

After you have finished working on this exercise, save your R script by clicking the Save icon in the script window. Give the script a meaningful name, for example “SDQ scale homogeneity analyses”. When closing the project via File / Close Project, make sure you select Save when prompted to save your ‘Workspace image’ (with extension .RData).

7.3 Further practice - Factor analysis of the remaining SDQ subscales

If you wish to practice further, you can repeat the factor analyses with the remaining SDQ scales. Use the list of items by subscale in Exercise 1 to include the right items in your analyses.

7.4 Solutions

Q1. The correlations are all moderate in size (roughly 0.2 to 0.5 in absolute value). Correlations between obeys and the other items are negative, while correlations among all other items are positive. This is because obeys indicates the low end of Conduct Problems (i.e. the lack of such problems), so it is keyed in the opposite direction to all other items. The data seem suitable for factor analysis because all items correlate - so they potentially share a common source of variance.

Q2. MSA = 0.76. The data are “middling” according to Kaiser’s guidelines (i.e. suitable for factor analysis).

Q3. According to the Scree plot of the observed data (in blue), the 1st factor accounts for a substantial amount of variance compared to the 2nd factor. There is a large drop from the 1st factor to the 2nd, and the “rubble” begins with the 2nd factor. This indicates that most co-variation is explained by one factor.

Parallel analysis confirms this conclusion: the simulated data yield a line (in red) that crosses the observed scree between the 1st and 2nd factors, with the 1st factor clearly above the red line and the 2nd factor below it. This means that only the 1st factor should be retained; the 2nd, 3rd and all subsequent factors should be discarded as part of the “rubble”.

Q4. Standardized factor loadings reflect the number of standard deviations by which the item score changes per 1 SD change in the factor score. The higher the loading, the more sensitive the item is to changes in the factor. In a single-factor model, standardized factor loadings are also correlations between the factor and the items (just like standardized coefficients in simple regression). For the Conduct Problems scale, factor loadings range between 0.45 and 0.71 in magnitude (i.e. absolute value). Loadings over 0.5 in magnitude can be considered reasonably high. The loading for obeys is negative (-0.67), which means that a higher factor score results in a lower score on this item. The marker item is tantrum, with the highest loading (0.71). The marker item indicates that the behaviour it describes, “I get very angry and often lose my temper”, is central to the meaning of the common factor.

Q5. Communality is the variance in the item due to the common factor, and uniqueness is the unique item variance. In standardised factor solutions (which is what function fa() prints), communality is the proportion of variance in the item due to the factor, and uniqueness is the remaining proportion (1-communality). Looking at the printed values, between 20% and 51% of variance in the items is due to the common factor.

Q6. The factor accounts for 36% of the variability in the five Conduct Problems items (see “Proportion Var” in the output).

Q7. The chi-square test is significant (the p-value, printed as “prob”, is about 0.011, which is below 0.05), which means the hypothesis that the single-factor model holds must be rejected.

Q8. The residuals are all quite close to zero. The largest residual is for correlation between “tantrum” and “lies”, approx. 0.09. It is still a small difference between observed and reproduced correlation (normally we ignore any residuals smaller than 0.1), so we conclude that the correlations are reproduced pretty well.

Q9. Omega Total = 0.74. Omega should provide a better estimate of the reliability of a homogeneous scale than alpha, because, unlike alpha, it does not assume that all factor loadings are equal. Here, alpha (0.72) slightly underestimates the reliability because the factor loadings are not all equal. (Alpha can only be lower than or equal to omega; the two are equal when all factor loadings are equal.)