8.3 Item-level Relative Fit measures

In CDM, item-level relative fit is discussed mostly in terms of model parsimony. The discussion in this section is based on Sorrel et al. (2017).

8.3.0.1 Parsimony Principle:

  • Simplicity in model selection: Choose the simplest model from a group of models that fit equally well.

  • General CDMs are more complex, thus requiring a larger sample size to be estimated reliably.

  • Reduced models have parameters with a more straightforward interpretation.

  • Appropriate reduced models lead to better attribute classification accuracy than the saturated model, particularly when the sample size is small and the item quality is poor.

  • Implications of Overfitting:

    • Overfitting may lead to poor generalization performance on new data.
    • Some residual variation from calibration data may be captured by overly complex models.

8.3.0.2 Preference for Reduced Models:

General CDMs are not always the best choice, for several reasons:

  • Complexity: General CDMs require larger sample sizes for reliable estimation.
  • Interpretation: Reduced models offer parameters with more straightforward interpretations.
  • Attribute Classification: Reduced models lead to better classification accuracy than saturated models, especially with small sample sizes and poor item quality.

8.3.0.3 Research Findings:

  • De la Torre and Lee (2013) highlight the advantages of reduced models.
  • Ma et al. (2016) found that combinations of different reduced models, determined by the Wald test, yield more accurate classification than unrestricted models like the G-DINA model.

In most cases, a general CDM serves as the starting point for exploring item-level model fit; here we use the G-DINA model. Item-level model comparison evaluates, for each item, whether the G-DINA model can be replaced by a reduced CDM without a significant loss in model-data fit. The procedure has two steps:

  • Fit the G-DINA model to the data
  • Conduct Wald test

8.3.0.4 Methods for Relative Fit

Sorrel et al. (2017) describe three methods for performing relative fit comparisons:

  • Likelihood ratio (LR) test

  • Lagrange multiplier (LM) test

  • Wald test

In the CDM context, these methods test the following hypotheses:

H0: The reduced model is the true model.

Ha: The more general model is the true model.

Here, H0 defines the restricted parameter space, i.e., the parameter constraints under which the saturated model reduces to the reduced model.

For example, suppose item \(j\) measures two attributes and the candidate reduced model is the LLM. The LLM corresponds to the G-DINA model with a logit link and no interaction terms, so under H0 the interaction term is 0.
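
Written out, the constraint in this two-attribute example looks as follows. This is a sketch in the usual G-DINA notation; the \(\delta\) symbols and the attribute indicators \(\alpha_1, \alpha_2\) are labels assumed here for illustration.

\[
\operatorname{logit} P(X_j = 1 \mid \alpha_1, \alpha_2) = \delta_{j0} + \delta_{j1}\alpha_1 + \delta_{j2}\alpha_2 + \delta_{j12}\alpha_1\alpha_2
\]

\[
H_0: \delta_{j12} = 0 \qquad \text{versus} \qquad H_a: \delta_{j12} \neq 0
\]

The saturated model has \(2^2 = 4\) parameters for this item and the LLM has 3, so H0 imposes a single constraint.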

Sorrel et al. (2017) concluded that the three relative fit statistics are asymptotically \(\chi^2\) distributed with \(2^{K^*_j}-p\) degrees of freedom, where \(K^*_j\) is the number of attributes measured by item \(j\) and \(p\) is the number of parameters of the reduced model.
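
The degrees of freedom are easy to tabulate once the parameter count of each reduced model is known. The sketch below assumes the usual counts (2 parameters per item for DINA and DINO, \(K^*_j + 1\) for the additive models A-CDM, LLM, and R-RUM); the function names `n_params` and `relative_fit_df` are hypothetical helpers, not part of any package.

```python
def n_params(model: str, k: int) -> int:
    """Number of item parameters for an item measuring k attributes.

    Assumed counts: the saturated G-DINA model has 2**k parameters,
    DINA and DINO have 2, and the additive models have k + 1.
    """
    model = model.upper()
    if model == "GDINA":
        return 2 ** k
    if model in ("DINA", "DINO"):
        return 2
    if model in ("ACDM", "LLM", "RRUM"):
        return k + 1
    raise ValueError(f"unknown model: {model}")


def relative_fit_df(model: str, k: int) -> int:
    """Degrees of freedom 2**k - p for testing a reduced model against G-DINA."""
    return 2 ** k - n_params(model, k)


# Tabulate the degrees of freedom for items measuring 2 or 3 attributes.
for k in (2, 3):
    for model in ("DINA", "DINO", "ACDM", "LLM", "RRUM"):
        print(f"K*={k} {model:5s} df={relative_fit_df(model, k)}")
```

For a single-attribute item all of these models coincide with the G-DINA model, so relative fit comparisons are only meaningful when \(K^*_j \geq 2\).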

Furthermore, the three methods are asymptotically equivalent.
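
For orientation, the three statistics can be written in their generic textbook form. The notation here is an assumption for illustration and may differ from that of Sorrel et al. (2017): \(\hat{\theta}\) is the item parameter estimate under the general model, \(\tilde{\theta}\) the estimate under the reduced model, \(\ell\) the log-likelihood, \(R\) a restriction matrix such that \(R\theta = 0\) under H0, \(S\) the score (gradient of \(\ell\)), and \(\hat{V}\) the estimated covariance matrix of the parameter estimates (the inverse information).

\[
LR = 2\left[\ell(\hat{\theta}) - \ell(\tilde{\theta})\right]
\]

\[
W = (R\hat{\theta})^\top \left[R\,\hat{V}(\hat{\theta})\,R^\top\right]^{-1} (R\hat{\theta})
\]

\[
LM = S(\tilde{\theta})^\top\, \hat{V}(\tilde{\theta})\, S(\tilde{\theta})
\]

The practical difference lies in what must be estimated: the LR test requires both models, the Wald test only the general model, and the LM test only the reduced model.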

Suppose an item measures two attributes and we know the item success probability for each latent group. How do we know whether the item is likely to conform to the DINA, DINO, or A-CDM?
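
One informal way to approach this question is to rewrite the four success probabilities as identity-link G-DINA parameters and see which constraints they satisfy. The sketch below is a descriptive check only, not a statistical test: the function name `conforming_models`, the tolerance `tol`, and the example probabilities are all hypothetical. With estimated probabilities the constraints never hold exactly, which is why the LR, LM, and Wald tests are needed in practice.

```python
def conforming_models(p00, p10, p01, p11, tol=0.02):
    """List the reduced models whose constraints roughly hold for a
    two-attribute item, given P(X=1) for latent groups 00, 10, 01, 11.

    tol is an arbitrary illustrative tolerance, not a statistical criterion.
    """
    # Identity-link G-DINA parameters for a two-attribute item.
    d1 = p10 - p00                # main effect of attribute 1
    d2 = p01 - p00                # main effect of attribute 2
    d12 = p11 - p10 - p01 + p00   # interaction effect

    models = []
    # DINA: only mastering both attributes raises success (d1 = d2 = 0).
    if abs(d1) <= tol and abs(d2) <= tol:
        models.append("DINA")
    # DINO: mastering any attribute gives the same success (d1 = d2 = -d12).
    if abs(d1 - d2) <= tol and abs(d1 + d12) <= tol:
        models.append("DINO")
    # A-CDM: attribute effects are additive (d12 = 0).
    if abs(d12) <= tol:
        models.append("A-CDM")
    return models


# Hypothetical success probabilities for groups 00, 10, 01, 11.
print(conforming_models(0.2, 0.2, 0.2, 0.9))
print(conforming_models(0.2, 0.9, 0.9, 0.9))
print(conforming_models(0.2, 0.5, 0.4, 0.7))
print(conforming_models(0.1, 0.3, 0.4, 0.95))
```

An empty list means none of the three reduced models is a plausible simplification, so the G-DINA model would be retained for that item.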

References

Sorrel, M. A., Abad, F. J., Olea, J., de la Torre, J., & Barrada, J. R. (2017). Inferential item-fit evaluation in cognitive diagnosis modeling. Applied Psychological Measurement, 41(8), 614–631.