8 Interaction Effects Between Endogenous Variables

library(modsem)

8.1 The Problem

Interaction effects between two endogenous (i.e., dependent) variables work as you would expect for the product indicator methods ("dblcent", "rca", "ca", "uca"). For the lms and qml approaches, however, it is not as straightforward.

By default, the lms and qml approaches can handle interaction effects between endogenous and exogenous (i.e., independent) variables, but not interaction effects between two endogenous variables. When two endogenous variables interact, the equations cannot easily be written in ‘reduced’ form, which means that the normal estimation procedures won’t work.
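For contrast, the product indicator case can be sketched as follows. This is a minimal illustration, assuming the TPB dataset shipped with modsem and the same model that is used in the example further down; the object names tpb_pi and est_dblcent are placeholders introduced here, not names from the package.

```r
library(modsem)

# Same TPB model as in the example section, with a quadratic effect of the
# endogenous variable INT (INT:INT)
tpb_pi <- '
# Outer Model
  ATT =~ att1 + att2 + att3 + att4 + att5
  SN =~ sn1 + sn2
  PBC =~ pbc1 + pbc2 + pbc3
  INT =~ int1 + int2 + int3
  BEH =~ b1 + b2

# Inner Model
  INT ~ ATT + SN + PBC
  BEH ~ INT + PBC
  BEH ~ INT:INT
'

# With a product indicator method, no special handling of INT should be needed
est_dblcent <- modsem(tpb_pi, data = TPB, method = "dblcent")
summary(est_dblcent)
```

The same call with method = "lms" or method = "qml" is the case the rest of this chapter deals with.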

8.2 The Solution

There is, however, a workaround for this limitation that applies to both the lms and qml approaches. In essence, the model is split into two parts, one linear and one non-linear. The covariance matrix used when estimating the non-linear model is then replaced with the model-implied covariance matrix from the linear model. This lets you treat an endogenous variable as if it were exogenous, provided that it can be expressed in a linear model.
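The idea can be written out as a short sketch. The notation below is an assumption made for illustration (it is not taken from the modsem documentation, and modsem's internal parameterization may differ): $\xi$ collects the exogenous latent variables with covariance matrix $\Phi$, $\eta_1$ is the endogenous variable that has a purely linear model, and $\eta_2$ is the outcome of the non-linear part.

```latex
% Linear part: eta_1 depends only on exogenous variables
\eta_1 = \Gamma \xi + \zeta_1, \qquad
\operatorname{Cov}(\xi) = \Phi, \qquad
\operatorname{Var}(\zeta_1) = \Psi

% Model-implied covariance matrix of x = (\xi, \eta_1)
\Sigma_x =
\begin{pmatrix}
  \Phi        & \Phi \Gamma^{\top} \\
  \Gamma \Phi & \Gamma \Phi \Gamma^{\top} + \Psi
\end{pmatrix}

% Non-linear part: x is treated as exogenous, with covariance \Sigma_x
\eta_2 = \beta^{\top} x + x^{\top} \Omega\, x + \zeta_2
```

Under this reading, the non-linear part never needs a reduced form for $\eta_1$: it only needs $\Sigma_x$, whose entries are functions of the linear-model parameters $\Gamma$, $\Phi$ and $\Psi$.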

8.3 Example

Let’s consider the theory of planned behaviour (TPB), where we wish to estimate the quadratic effect of INT on BEH (INT:INT), using the following model:

tpb <- ' 
# Outer Model (Based on Hagger et al., 2007)
  ATT =~ att1 + att2 + att3 + att4 + att5
  SN =~ sn1 + sn2
  PBC =~ pbc1 + pbc2 + pbc3
  INT =~ int1 + int2 + int3
  BEH =~ b1 + b2

# Inner Model (Based on Steinmetz et al., 2011)
  INT ~ ATT + SN + PBC
  BEH ~ INT + PBC 
  BEH ~ INT:INT
'

Since INT is an endogenous variable, its quadratic term (i.e., an interaction effect with itself) involves two endogenous variables. Ordinarily, we would therefore not be able to estimate this model using the lms or qml approach. However, we can split the model into two parts, one linear and one non-linear. Although INT is an endogenous variable, it can be expressed in a linear model, since it is not affected by any interaction terms:

tpb_linear <- 'INT ~ PBC + ATT + SN'

We could then remove this part from the original model, giving us:

tpb_nonlinear <- ' 
# Outer Model (Based on Hagger et al., 2007)
  ATT =~ att1 + att2 + att3 + att4 + att5
  SN =~ sn1 + sn2
  PBC =~ pbc1 + pbc2 + pbc3
  INT =~ int1 + int2 + int3
  BEH =~ b1 + b2

# Inner Model (Based on Steinmetz et al., 2011)
  BEH ~ INT + PBC 
  BEH ~ INT:INT
'

We could now estimate just the non-linear model, since INT is now an exogenous variable. This would, however, not incorporate the structural model for INT. To address this, we can make modsem replace the covariance matrix (phi) of (INT, PBC, ATT, SN) with the model-implied covariance matrix from the linear model, whilst estimating both models simultaneously. To achieve this, we can use the cov.syntax argument in modsem:

est_qml <- modsem(tpb_nonlinear, data = TPB, cov.syntax = tpb_linear, method = "qml")
est_lms <- modsem(tpb_nonlinear, data = TPB, cov.syntax = tpb_linear, method = "lms")
summary(est_lms)
#> Estimating null model
#> EM: Iteration =     1, LogLik =   -26393.22, Change = -26393.223
#> EM: Iteration =     2, LogLik =   -26393.22, Change =      0.000
#> 
#> modsem (version 1.0.1):
#>   Estimator                                          LMS
#>   Optimization method                          EM-NLMINB
#>   Number of observations                            2000
#>   Number of iterations                                82
#>   Loglikelihood                                -23780.84
#>   Akaike (AIC)                                  47669.69
#>   Bayesian (BIC)                                47972.13
#>  
#> Numerical Integration:
#>   Points of integration (per dim)                     24
#>   Dimensions                                           1
#>   Total points of integration                         24
#>  
#> Fit Measures for H0:
#>   Loglikelihood                                   -26393
#>   Akaike (AIC)                                  52892.45
#>   Bayesian (BIC)                                53189.29
#>   Chi-square                                       66.27
#>   Degrees of Freedom (Chi-square)                     82
#>   P-value (Chi-square)                             0.897
#>   RMSEA                                            0.000
#>  
#> Comparative fit to H0 (no interaction effect)
#>   Loglikelihood change                           2612.38
#>   Difference test (D)                            5224.76
#>   Degrees of freedom (D)                               1
#>   P-value (D)                                      0.000
#>  
#> R-Squared:
#>   BEH                                              0.235
#>   INT                                              0.364
#> R-Squared Null-Model (H0):
#>   BEH                                              0.210
#>   INT                                              0.367
#> R-Squared Change:
#>   BEH                                              0.025
#>   INT                                             -0.002
#> 
#> Parameter Estimates:
#>   Coefficients                            unstandardized
#>   Information                                   expected
#>   Standard errors                               standard
#>  
#> Latent Variables:
#>                   Estimate  Std.Error  z.value  Pr(>|z|)
#>   INT =~         
#>     int1             1.000                              
#>     int2             0.915      0.016    58.88     0.000
#>     int3             0.807      0.015    55.03     0.000
#>   ATT =~         
#>     att1             1.000                              
#>     att2             0.878      0.012    70.95     0.000
#>     att3             0.789      0.012    65.48     0.000
#>     att4             0.695      0.011    61.41     0.000
#>     att5             0.887      0.013    70.22     0.000
#>   SN =~          
#>     sn1              1.000                              
#>     sn2              0.888      0.017    51.87     0.000
#>   PBC =~         
#>     pbc1             1.000                              
#>     pbc2             0.913      0.013    68.78     0.000
#>     pbc3             0.801      0.012    66.27     0.000
#>   BEH =~         
#>     b1               1.000                              
#>     b2               0.959      0.033    28.92     0.000
#> 
#> Regressions:
#>                   Estimate  Std.Error  z.value  Pr(>|z|)
#>   BEH ~          
#>     INT              0.196      0.026     7.66     0.000
#>     PBC              0.238      0.022    10.72     0.000
#>     INT:INT          0.129      0.018     7.30     0.000
#>   INT ~          
#>     PBC              0.219      0.029     7.48     0.000
#>     ATT              0.210      0.026     8.22     0.000
#>     SN               0.171      0.027     6.25     0.000
#> 
#> Intercepts:
#>                   Estimate  Std.Error  z.value  Pr(>|z|)
#>     int1             1.006      0.020    49.44     0.000
#>     int2             1.006      0.019    52.89     0.000
#>     int3             0.999      0.017    57.39     0.000
#>     att1             1.010      0.024    42.52     0.000
#>     att2             1.003      0.021    47.52     0.000
#>     att3             1.013      0.019    52.17     0.000
#>     att4             0.996      0.018    56.20     0.000
#>     att5             0.989      0.021    46.28     0.000
#>     sn1              1.002      0.024    42.13     0.000
#>     sn2              1.007      0.021    47.24     0.000
#>     pbc1             0.994      0.023    42.91     0.000
#>     pbc2             0.981      0.022    45.40     0.000
#>     pbc3             0.988      0.019    50.90     0.000
#>     b1               0.996      0.024    42.05     0.000
#>     b2               1.015      0.022    45.41     0.000
#>     BEH              0.000                              
#> 
#> Covariances:
#>                   Estimate  Std.Error  z.value  Pr(>|z|)
#>   PBC ~~         
#>     ATT              0.673      0.021    31.68     0.000
#>     SN               0.673      0.022    30.89     0.000
#>   ATT ~~         
#>     SN               0.624      0.019    33.61     0.000
#> 
#> Variances:
#>                   Estimate  Std.Error  z.value  Pr(>|z|)
#>     int1             0.161      0.009    18.50     0.000
#>     int2             0.161      0.008    20.52     0.000
#>     int3             0.170      0.007    23.49     0.000
#>     att1             0.167      0.007    23.46     0.000
#>     att2             0.150      0.006    24.35     0.000
#>     att3             0.160      0.006    26.60     0.000
#>     att4             0.162      0.006    27.54     0.000
#>     att5             0.159      0.006    25.04     0.000
#>     sn1              0.178      0.015    12.01     0.000
#>     sn2              0.157      0.012    13.23     0.000
#>     pbc1             0.145      0.008    18.56     0.000
#>     pbc2             0.160      0.007    21.88     0.000
#>     pbc3             0.154      0.006    24.03     0.000
#>     b1               0.185      0.020     9.25     0.000
#>     b2               0.136      0.018     7.45     0.000
#>     BEH              0.475      0.024    19.45     0.000
#>     PBC              0.956      0.018    53.02     0.000
#>     ATT              0.993      0.014    68.94     0.000
#>     SN               0.983      0.015    65.75     0.000
#>     INT              0.481      0.019    25.00     0.000
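To make the role of the model-implied covariance matrix concrete, the sketch below rebuilds it by hand in base R from the rounded estimates printed above. It assumes that the value listed for INT under Variances is its residual variance, and it is only an illustration of the idea, not a reproduction of modsem's internal computations. The object names are introduced here for the example.

```r
# Rounded estimates copied from the summary output above
gamma <- c(PBC = 0.219, ATT = 0.210, SN = 0.171)  # INT ~ PBC + ATT + SN

# Covariance matrix of the exogenous variables (PBC, ATT, SN)
phi <- matrix(
  c(0.956, 0.673, 0.673,
    0.673, 0.993, 0.624,
    0.673, 0.624, 0.983),
  nrow = 3, byrow = TRUE,
  dimnames = list(names(gamma), names(gamma))
)

psi_int <- 0.481  # assumed to be the residual variance of INT

# Implied covariances between INT and (PBC, ATT, SN), and implied Var(INT)
cov_x_int <- as.vector(phi %*% gamma)
var_int <- as.numeric(t(gamma) %*% phi %*% gamma) + psi_int

# Model-implied covariance matrix of (PBC, ATT, SN, INT)
implied <- rbind(
  cbind(phi, INT = cov_x_int),
  INT = c(cov_x_int, var_int)
)

# R-squared for INT implied by these estimates
r2_int <- 1 - psi_int / var_int

round(implied, 3)
round(r2_int, 3)
```

Because the inputs are rounded, the result is only approximate, but r2_int should land close to the R-squared of 0.364 reported for INT in the summary.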