Exercise 20 Fitting a latent autoregressive model to longitudinal test measurements

Data file WISC.RData
R package lavaan

20.1 Objectives

In this exercise, you will test for autoregressive relationships between latent constructs. This will be done through a so-called full structural model, which includes measurement part(s) (where latent traits are indicated by observed variables) and a structural part (where we explore relationships between latent constructs - ‘true’ scores - rather than their imperfect measures). Incorporating the measurement part will be a way of dealing with unreliable data because the error of measurement can be explicitly controlled. As in Exercise 19, we will model repeated (observed) measurements over several time points, where each subsequent measurement occasion is regressed on the previous one, hence the name. However, we will model the autoregressive relationships not between the observed measures but between the latent variables (‘true’ scores) in the structural part of the model.

20.2 Study of longitudinal stability of the latent ability across 4 time points

Data for this exercise is the same as for Exercise 19. It is taken from a study by Osborne and Suddick (1972). In this study, N=204 children took the Wechsler Intelligence Test for Children (WISC) at ages 6, 7, 9 and 11. There are two subtests - Verbal and Non-verbal reasoning.

Scores on both tests were recorded for each child at each time point, resulting in 4 variables for the Verbal test (V1, V2, V3, V4) and 4 variables for the Non-verbal test (N1, N2, N3, N4).

Please refer to Exercise 19 for the descriptive statistics - means, standard deviations and correlations of all the observed variables.

20.3 Worked Example - Testing a latent autoregressive model for WISC Verbal subtest

To complete this exercise, you need to repeat the analysis of the Verbal subtest from the worked example below and answer some questions. Once you are confident, you can repeat the analysis of the Non-verbal subtest independently.

Step 1. Reading data

You can continue working in the project created for Exercise 19. Alternatively, download the summary statistics for WISC data stored in the file WISC.RData, save it in a new folder (e.g. “Latent autoregressive model of WISC data”), and create a new project in this new folder.

If it is not in your Environment already, load the data consisting of means (WISC.mean, which we will not use in this exercise), correlations (WISC.corr), and SDs (WISC.SD):

load("WISC.RData")

And produce the WISC covariance matrix from the correlations and SDs:

#  produce WISC covariance matrix
WISC.cov <- crossprod(crossprod(WISC.corr, diag(WISC.SD)), diag(WISC.SD))

varnames <- c("V1","V2","V3","V4","N1","N2","N3","N4")
rownames(WISC.cov) <- varnames
colnames(WISC.cov) <- varnames
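As a sanity check, the crossprod() construction above is just the matrix product D R D, where D is the diagonal matrix of SDs and R is the correlation matrix. A toy 2x2 example (made-up numbers, not the WISC data) confirms that the two forms agree:

```r
# Toy correlation matrix and SDs (made-up numbers for illustration only)
R  <- matrix(c(1,   0.5,
               0.5, 1), nrow = 2)
SD <- c(2, 3)

cov1 <- crossprod(crossprod(R, diag(SD)), diag(SD))  # as in the text
cov2 <- diag(SD) %*% R %*% diag(SD)                  # direct D R D formula

all.equal(cov1, cov2)   # TRUE
cov1[1, 2]              # 0.5 * 2 * 3 = 3
```

Each covariance is simply the correlation rescaled by the two SDs, and the diagonal recovers the variances (SD squared).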

Step 2. Specifying and fitting the model

Before you start, load the package lavaan:

library(lavaan)

We will fit the following latent autoregressive model.

Latent autoregressive model for WISC Verbal subtest.

Figure 20.1: Latent autoregressive model for WISC Verbal subtest.

There are 4 measurement models - one for each of the Verbal scores across 4 time points. These models are based on the classic true-score model:

V = T + E

Each observed verbal score, for example V1, is a sum of two latent variables - the ‘true’ score (T1) and the error of measurement (E1). The same applies to V2, V3 and V4. The paths from each T and E latent variables to the observed measures of V are fixed to 1, as the true-score model prescribes.

With these measurement models in place, the autoregressive relationships pertain to the ‘true’ scores only: T1, T2, T3 and T4.
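If the true-score decomposition V = T + E is unfamiliar, a quick simulation (with arbitrary, made-up variances) illustrates why the observed-score variance is the sum of the true-score and error variances when T and E are independent:

```r
# Illustrative simulation of the classic true-score model V = T + E
# (arbitrary variances chosen for the example; not the WISC estimates)
set.seed(1)
n <- 100000
T.score <- rnorm(n, mean = 0, sd = 5)   # 'true' scores, variance = 25
E.score <- rnorm(n, mean = 0, sd = 3)   # errors of measurement, variance = 9
V.score <- T.score + E.score            # observed scores

var(V.score)   # close to 25 + 9 = 34
```

Because T and E are uncorrelated, their variances add; this is exactly the decomposition the measurement part of the model exploits.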

Let’s code this model using the lavaan syntax conventions, starting with describing the four measurement models:

T1 is measured by V1 (error of measurement E1 is assumed; lavaan will label it .V1)

T2 is measured by V2 (error of measurement E2 is assumed; lavaan will label it .V2)

T3 is measured by V3 (error of measurement E3 is assumed; lavaan will label it .V3)

T4 is measured by V4 (error of measurement E4 is assumed; lavaan will label it .V4)

We then describe the autoregressive relationships between T1, T2, T3 and T4.

T2 is regressed on T1 (residual for T2 is assumed)

T3 is regressed on T2 (residual for T3 is assumed)

T4 is regressed on T3 (residual for T4 is assumed)

That’s it. Try to write the respective statements using the shorthand symbols =~ for is measured by and ~ for is regressed on. You should obtain something similar to this (I called my model T.auto, but you can call it whatever you like):

T.auto <- ' 
# measurement part (by default, the loadings will be fixed to 1)
   T1 =~ V1
   T2 =~ V2
   T3 =~ V3
   T4 =~ V4
# structural part - auto regressions
   T2 ~ T1
   T3 ~ T2
   T4 ~ T3 '

To fit this model, we need the function sem(). We need to pass to it the model (model=T.auto, or simply T.auto if you put this argument first) and the data. However, because our data is the WISC covariance matrix, we need to pass the matrix to the argument sample.cov, rather than data, which is reserved for raw subject data. We also need to pass the number of observations (sample.nobs = 204), because this information is not contained in the covariance matrix.

fit <- sem(T.auto, sample.cov= WISC.cov, sample.nobs=204)

Step 3. Assessing Goodness of Fit with global indices and residuals

To access the model fitting results stored in object fit, request the summary with standardized parameter estimates:

summary(fit, standardized=TRUE)
## lavaan 0.6-19 ended normally after 69 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         7
## 
##   Number of observations                           204
## 
## Model Test User Model:
##                                                       
##   Test statistic                                58.473
##   Degrees of freedom                                 3
##   P-value (Chi-square)                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   T1 =~                                                                 
##     V1                1.000                               5.816    1.000
##   T2 =~                                                                 
##     V2                1.000                               6.115    1.000
##   T3 =~                                                                 
##     V3                1.000                               7.322    1.000
##   T4 =~                                                                 
##     V4                1.000                              10.674    1.000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   T2 ~                                                                  
##     T1                0.754    0.051   14.691    0.000    0.717    0.717
##   T3 ~                                                                  
##     T2                0.905    0.055   16.496    0.000    0.756    0.756
##   T4 ~                                                                  
##     T3                1.162    0.062   18.847    0.000    0.797    0.797
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .V1                0.000                               0.000    0.000
##    .V2                0.000                               0.000    0.000
##    .V3                0.000                               0.000    0.000
##    .V4                0.000                               0.000    0.000
##     T1               33.822    3.349   10.100    0.000    1.000    1.000
##    .T2               18.170    1.799   10.100    0.000    0.486    0.486
##    .T3               22.971    2.274   10.100    0.000    0.428    0.428
##    .T4               41.560    4.115   10.100    0.000    0.365    0.365

Examine the output, starting with the Chi-square statistic. Refer to Exercise 19 for the model output, and you will see that the Chi-square is exactly the same, 58.473 on 3 Degrees of freedom. This is not what we expected. We expected different fit due to estimating more parameters (errors of measurement that we included in the model). Also, the standardized regression coefficients for the ‘true’ scores T2~T1, T3~T2, and T4~T3 are the same as the standardized regression coefficients for the observed scores V2~V1, V3~V2 and V4~V3 in the path model of Exercise 19. To find out where we went wrong, let’s examine the parameter estimates. Ah! The variances of errors of measurement that we assumed for the observed variables (.V1, .V2, .V3 and .V4) have been fixed to zero by lavaan:

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  .V1                0.000                               0.000    0.000
  .V2                0.000                               0.000    0.000
  .V3                0.000                               0.000    0.000
  .V4                0.000                               0.000    0.000

Our model is therefore reduced to the path model of Exercise 19, because it assumes that the errors of measurement are zero (and that the observed scores are perfectly reliable).

To specify the model in the way we intended, we need to explicitly free up the variances of measurement errors, using the ~~ syntax convention. We add the following syntax to T.auto, making T.auto2:

T.auto2 <- paste(T.auto, '
                 
                 # free up error variance
                 V1 ~~ V1
                 V2 ~~ V2
                 V3 ~~ V3
                 V4 ~~ V4
                 ')

fit2 <- sem(T.auto2, sample.cov= WISC.cov, sample.nobs=204)
## Warning: lavaan->lav_model_vcov():  
##    Could not compute standard errors! The information matrix could not be 
##    inverted. This may be a symptom that the model is not identified.

It appears we have a problem. The warning message is telling us that the standard errors could not be computed. To investigate the problem further, ask for the summary output:

summary(fit2)
## lavaan 0.6-19 ended normally after 35 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        11
## 
##   Number of observations                           204
## 
## Model Test User Model:
##                                                       
##   Test statistic                                    NA
##   Degrees of freedom                                -1
##   P-value (Unknown)                                 NA
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   T1 =~                                               
##     V1                1.000                           
##   T2 =~                                               
##     V2                1.000                           
##   T3 =~                                               
##     V3                1.000                           
##   T4 =~                                               
##     V4                1.000                           
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   T2 ~                                                
##     T1                1.117       NA                  
##   T3 ~                                                
##     T2                1.196       NA                  
##   T4 ~                                                
##     T3                1.366       NA                  
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .V1               11.002       NA                  
##    .V2                8.833       NA                  
##    .V3                8.016       NA                  
##    .V4               28.797       NA                  
##     T1               22.820       NA                  
##    .T2                0.069       NA                  
##    .T3                4.758       NA                  
##    .T4                0.041       NA

From the output it is clear that our model is not identified, because we have a negative number of degrees of freedom, -1. This means we have 1 more parameter to estimate than the number of sample moments available.

The previous model, T.auto, had 3 degrees of freedom. By asking lavaan to estimate 4 error variances, we used up all 3 of these degrees of freedom and then one more, which we did not have.
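The degrees-of-freedom bookkeeping can be verified by hand, using the parameter counts reported in the two outputs above:

```r
# Counting degrees of freedom by hand
p <- 4                          # observed variables V1..V4
moments <- p * (p + 1) / 2      # unique variances + covariances = 10

df.auto  <- moments - 7         # T.auto estimated 7 parameters  -> df =  3
df.auto2 <- moments - 11        # T.auto2 estimated 11 parameters -> df = -1

c(moments = moments, df.auto = df.auto, df.auto2 = df.auto2)
```

A negative df means the model asks for more parameters than the data can support, which is exactly the identification problem lavaan warned us about.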

Step 4. Constraining the errors of measurement equal across time

To remedy the model under-identification, we need to reduce the number of parameters. Clearly we cannot hope to estimate all 4 error variances. The most sensible way to proceed is to estimate one common variance for the 4 errors of measurement. This makes perfect conceptual sense because the same Verbal test was used on all 4 measurement occasions, and the error of measurement is a property of the test and therefore should not change over time (see the wonderful explanation of this in McDonald’s “Test Theory: A Unified Treatment”, 1999 edition, page 68). Therefore, we can constrain the 4 error variances to be equal, thus estimating one parameter instead of four.

We will constrain the error variances to be equal by using a parameter label, which we arbitrarily call e. Giving several parameters the same label tells lavaan to estimate them as a single, equal value. So, instead of the freely estimated variances of T.auto2, we append the labelled (constrained) versions to T.auto:

T.auto3 <- paste(T.auto, '
                 
                 # free up error variance but constrain equal
                 V1 ~~ e*V1
                 V2 ~~ e*V2
                 V3 ~~ e*V3
                 V4 ~~ e*V4
                 ')

And now we can fit the model again and obtain the summary output:

fit3 <- sem(T.auto3, sample.cov= WISC.cov, sample.nobs=204)

summary(fit3, standardized=TRUE)
## lavaan 0.6-19 ended normally after 57 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        11
##   Number of equality constraints                     3
## 
##   Number of observations                           204
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 2.115
##   Degrees of freedom                                 2
##   P-value (Chi-square)                           0.347
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   T1 =~                                                                 
##     V1                1.000                               5.033    0.865
##   T2 =~                                                                 
##     V2                1.000                               5.366    0.879
##   T3 =~                                                                 
##     V3                1.000                               6.725    0.918
##   T4 =~                                                                 
##     V4                1.000                              10.268    0.962
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   T2 ~                                                                  
##     T1                1.010    0.082   12.356    0.000    0.947    0.947
##   T3 ~                                                                  
##     T2                1.186    0.074   15.983    0.000    0.946    0.946
##   T4 ~                                                                  
##     T3                1.375    0.077   17.940    0.000    0.900    0.900
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .V1         (e)    8.494    1.112    7.638    0.000    8.494    0.251
##    .V2         (e)    8.494    1.112    7.638    0.000    8.494    0.228
##    .V3         (e)    8.494    1.112    7.638    0.000    8.494    0.158
##    .V4         (e)    8.494    1.112    7.638    0.000    8.494    0.075
##     T1               25.329    3.529    7.178    0.000    1.000    1.000
##    .T2                2.972    2.053    1.448    0.148    0.103    0.103
##    .T3                4.738    2.081    2.276    0.023    0.105    0.105
##    .T4               19.986    4.278    4.672    0.000    0.190    0.190

Examine the output and answer the following questions.

QUESTION 1. How many parameters are there to estimate? What are they? How many known pieces of information (sample moments) are there in the data? What are the degrees of freedom and how are they calculated?

QUESTION 2. Interpret the chi-square. Do you retain or reject the model?

Step 5. Interpreting and appraising the model results

Now examine the model parameters, and try to answer the following questions.

QUESTION 3. What is the standardized effect of T1 on T2? How does it compare to the observed correlation between V1 and V2? If you have completed Exercise 19, how would you interpret this result?

QUESTION 4. What is the (unstandardized) variance of the measurement error of the Verbal test? Why are the standardized values (Std.all) for this parameter different for the 4 measurements? What does it tell you about the reliability of the Verbal scores?

To summarise, the latent autoregressive model predicts that the standardized indirect effect should get smaller the more distal the outcome is. And now our data support this prediction. The correlations between the ‘true’ scores at the adjacent time points are much stronger (0.9 or above) than the path model in Exercise 19 would have you believe. The reason for misfit of that path model was not the existence of “sleeper” effects, but the failure to account for the error of measurement.

Step 6. Saving your work

After you have finished working with this exercise, save your R script with a meaningful name, for example “Latent autoregressive model of WISC”. To keep all of the created objects, which might be useful when revisiting this exercise, save your entire ‘work space’ when closing the project. Press File / Close project, and select “Save” when prompted to save your ‘Workspace image’.

20.4 Further practice - Testing a latent autoregressive model for WISC Non-verbal subtest

To practice further, fit a latent autoregressive path model to the WISC Non-verbal scores, and interpret the results.

20.5 Solutions

Q1. There are 4 observed variables, therefore 4x(4+1)/2 = 10 sample moments in total (made up from 4 variances and 6 covariances).

The output prints that the “Number of model parameters” is 11, and the “Number of equality constraints” is 3. The 11 parameters are made up of 3 regression paths (arrows on the diagram) for the ‘true’ scores (T variables), plus 3 variances of the regression residuals for T2, T3 and T4, plus 1 variance of the independent latent variable T1, plus the 4 error variances of the V variables. Nominally this makes 11, but there are only 11-3=8 parameters to estimate, because we imposed equality constraints on the error variances, estimating only 1 parameter instead of 4 (3 fewer parameters than in model T.auto2).

The degrees of freedom are then 2 (as lavaan rightly tells you): 10 (moments) – 8 (parameters) = 2 (df).

Q2. The chi-square test (Chi-square = 2.115, Degrees of freedom = 2) suggests accepting the model because it is quite likely (P-value = .347) that the observed data could emerge from the population in which the model is true.

Q3. The standardized effect of T1 on T2 is 0.947 (look at Std.lv, which stands for “standardized latent variables”). This is much higher than the observed correlation between V1 and V2, which was 0.717. From the explanation of this latter value in Exercise 19, it follows that 0.717 was an estimate of the covariance between the ‘true’ Verbal scores at ages 6 and 7, rather than their correlation. Because the scores contained error of measurement, the (standardized) variances of the ‘true’ scores were lower than 1, and therefore the covariance of 0.717 equated to a much higher correlation between the ‘true’ scores, now estimated as 0.947.
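As a rough arithmetic check (using the rounded values from the fit3 output, so expect small rounding discrepancies), the observed correlation should be approximately the product of the two standardized loadings and the latent correlation:

```r
# Rounded Std.all loadings of V1 and V2, and the latent T1-T2 correlation,
# taken from the fit3 output above
lambda1  <- 0.865
lambda2  <- 0.879
r.latent <- 0.947

lambda1 * lambda2 * r.latent   # approximately 0.72, close to the observed 0.717
```

This is the classic attenuation formula: measurement error shrinks the latent correlation of 0.947 down to the observed correlation of about 0.72.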

Q4. The unstandardized variance of the measurement error of the Verbal test is 8.494. This value is identical across the 4 time points, and is marked with the label (e) in the output. However, the standardized values (Std.all) for this parameter differ across the 4 measurements:

Variances:
                 Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .V1         (e)    8.494    1.112    7.638    0.000    8.494    0.251
 .V2         (e)    8.494    1.112    7.638    0.000    8.494    0.228
 .V3         (e)    8.494    1.112    7.638    0.000    8.494    0.158
 .V4         (e)    8.494    1.112    7.638    0.000    8.494    0.075

This is because the test score variance is composed of ‘true’ and ‘error’ variance, and even though the error variance was constrained equal, the ‘true’ score variance changes over time because the ‘true’ score is the property of the person, and people change. Std.all values show you the proportion of error in the observed score at each time point, and because the values are getting smaller, we can conclude that the proportion of ‘true’ score variance increases. Presumably, the ‘true’ score variance at time 4 (age 11) was the largest.

From these results, it is easy to calculate the reliability of the Verbal test scores. By definition, reliability is the proportion of the variance in the observed score due to the ‘true’ score. As the Std.all values give the proportion of error, reliability is computed as 1 minus the given proportion of error. So, the reliability of V1 is 1-0.251=0.749, of V2 is 1-0.228=0.772, of V3 is 1-0.158=0.842, and of V4 is 1-0.075=0.925. As Roderick McDonald (1999) pointed out, reliability will vary across populations sampled, as it is a joint property of the test and the examinees. In contrast, the error of measurement is a property of the test only, and this is why the ultimate goal of reliability analysis is to obtain an estimate of the variance of E in the test score metric. In this data analysis example, this estimate is e=8.494.
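These reliability calculations are simple to reproduce in R, using the rounded Std.all values from the output above:

```r
# Reliability at each time point: 1 minus the standardized error proportion
# (rounded Std.all values for .V1-.V4 from the fit3 output)
error.prop  <- c(V1 = 0.251, V2 = 0.228, V3 = 0.158, V4 = 0.075)
reliability <- 1 - error.prop

round(reliability, 3)   # V1 = 0.749, V2 = 0.772, V3 = 0.842, V4 = 0.925
```

The increasing reliabilities reflect growing ‘true’ score variance over time against a constant error variance, not any change in the test itself.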