7.5 Relative Model-Data Fit at Test Level: Practice Session in R

In this session, we will use the GDINA R package to obtain relative fit indices for different CDMs.

To evaluate relative model-data fit:

  • First, fit the competing CDMs to the data. The choice of competing CDMs may depend on item types, assumptions about the attribute structure, and other relevant factors.

  • Second, obtain the relative fit measures.

  • Third, compare the relative fit measures.

Let us practice obtaining relative fit measures for different CDMs.

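Preparation: Please download data1 and Q1 from Blackboard and save them in a data subfolder of your working directory, since the code reads ./data/data1.dat and ./data/Q1.txt.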

First, let us estimate the CDMs.

  • In the first model (fit1), let us consider the DINO model.
  • In the second model (fit2), let us consider the G-DINA model.
  • In the third model (fit3), let us consider different item-specific CDMs.

# Load the GDINA package
library(GDINA)

# Read the response data and the Q-matrix
data1 <- read.table(file = "./data/data1.dat", header = TRUE)
Q1 <- read.table(file = "./data/Q1.txt")

# Fit the DINO model; the verbose argument controls how much
# information is printed during estimation (0 prints nothing)
fit1 <- GDINA(dat = data1, Q = Q1, model = "DINO", verbose = 0)

The summary function, among others, prints test-level relative model-data fit statistics:

summary(fit1)
## 
## Test Fit Statistics
## 
## Loglik = -7524.80 
## 
## AIC    = 15123.60  | penalty [2 * p]  = 74.00 
## BIC    = 15298.61  | penalty [log(n) * p]  = 249.00 
## CAIC   = 15335.61  | penalty [(log(n) + 1) * p]  = 286.00 
## SABIC  = 15181.11  | penalty [log((n + 2)/24) * p]  = 131.50 
## 
## No. of parameters (p)  = 37 
##   No. of estimated item parameters =  30 
##   No. of fixed item parameters =  0 
##   No. of distribution parameters =  7 
## 
## Attribute Prevalence
## 
##    Level0 Level1
## A1   0.64   0.36
## A2   0.53   0.47
## A3   0.76   0.24

If we want to compare two nested models, we can use the likelihood ratio (LR) test. AIC, BIC, CAIC, and SABIC can also be used to compare models fitted to the same data, whether or not the models are nested.

To test whether the DINO model fits the data as well as the G-DINA model, we pass the two fitted model objects to the anova function:

# fit1 is based on the DINO model
# fit2 is based on the G-DINA model
fit2 <- GDINA(dat = data1, Q = Q1, model = "GDINA", verbose = 0)
anova(fit1, fit2)
## 
## Information Criteria and Likelihood Ratio Test
## 
##      #par   logLik Deviance      AIC      BIC     CAIC    SABIC chisq df p-value
## fit1   37 -7524.80 15049.60 15123.60 15298.61 15335.61 15181.11 186.7 30  <0.001
## fit2   67 -7431.45 14862.91 14996.91 15313.81 15380.81 15101.04

We can also compare the fit of a combination of item-specific CDMs with that of the G-DINA model:

# Fit different CDMs to different items (one entry per item)
models <- c(rep("GDINA", 4), "LLM", "ACDM", "GDINA", "LLM", "RRUM",
            "ACDM", "RRUM", "ACDM", "DINA", "RRUM", "RRUM")
fit3 <- GDINA(dat = data1, Q = Q1, model = models, verbose = 0)
anova(fit1, fit2, fit3)
## 
## Information Criteria and Likelihood Ratio Test
## 
##      #par   logLik Deviance      AIC      BIC     CAIC    SABIC chisq df p-value
## fit1   37 -7524.80 15049.60 15123.60 15298.61 15335.61 15181.11 186.7 30  <0.001
## fit2   67 -7431.45 14862.91 14996.91 15313.81 15380.81 15101.04                 
## fit3   50 -7443.62 14887.24 14987.24 15223.73 15273.73 15064.95 24.33 17    0.11
## 
## Notes: In LR tests, models were tested against fit2 
##        LR test(s) do NOT check whether models are nested or not.

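Lower values of the relative fit measures indicate better model-data fit. Here, fit3 has the lowest AIC, BIC, CAIC, and SABIC, and its LR test against fit2 is not significant (p = 0.11), so the item-specific combination fits about as well as the G-DINA model while using fewer parameters.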
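
The comparison step can also be done programmatically. The sketch below copies the fit statistics from the anova output into a data frame (the object names fit_table and best are illustrative, not part of GDINA), picks the model with the smallest value for each criterion, and recomputes the LR test of fit3 against fit2.

```r
# Fit statistics copied from the anova() output for the three models
fit_table <- data.frame(
  model  = c("fit1", "fit2", "fit3"),
  npar   = c(37, 67, 50),
  logLik = c(-7524.80, -7431.45, -7443.62),
  AIC    = c(15123.60, 14996.91, 14987.24),
  BIC    = c(15298.61, 15313.81, 15223.73),
  CAIC   = c(15335.61, 15380.81, 15273.73),
  SABIC  = c(15181.11, 15101.04, 15064.95),
  stringsAsFactors = FALSE
)

# For each criterion, find the model with the smallest value
criteria <- c("AIC", "BIC", "CAIC", "SABIC")
best <- sapply(criteria, function(ic) {
  fit_table$model[which.min(fit_table[[ic]])]
})
best

# LR test of fit3 against fit2 (the reference model in the output)
lr3 <- 2 * (fit_table$logLik[2] - fit_table$logLik[3])
df3 <- fit_table$npar[2] - fit_table$npar[3]
p3  <- pchisq(lr3, df = df3, lower.tail = FALSE)
c(chisq = lr3, df = df3, p = p3)
```

Since the log-likelihoods are rounded in the printed table, the recomputed chi-square value may differ from the reported one in the second decimal.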