6.1 Global Identifiability

A model is (globally) identifiable if the parameter values uniquely determine the probability distribution of the data and the probability distribution of the data uniquely determines the parameter values. (Everitt & Howell, 2005)

Important aspects:

  • The parameter values uniquely determine the probability distribution of the data.

  • The probability distribution of the data uniquely determines the parameter values.

Important aspects about CDMs (Xu & Zhang, 2016):

  • For global identifiability, distinct sets of model parameters must yield distinct distributions of the observed responses.
  • If the data cannot differentiate between different parameterizations, we may lose the consistency of the parameter estimates.
  • The Q-matrix plays an important role in CDM estimation.

Identifiability vs. Consistently Estimable

  • Identifiability is a necessary condition for statistical consistency of a parameter estimate. Nonetheless, identifiability conditions alone are not always sufficient for consistent estimation. We define a parameter as consistently estimable if a consistent estimator can be constructed for it, i.e., for a parameter \(\beta\), there exists an estimator \(\hat{\beta}_N\) such that \(\hat{\beta}_N-\beta \rightarrow 0\) in probability as the sample size \(N\rightarrow \infty\).

See Xu (2019) for more detail.
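To make convergence in probability concrete, here is a minimal simulation sketch. It does not use a CDM: it assumes the parameter is a single Bernoulli proportion and the estimator is the sample mean, which is known to be consistent. The value `true_p`, the sample sizes, and the function name are illustrative choices only.

```python
import random

def sample_proportion(true_p, n, rng):
    """Sample mean of n Bernoulli(true_p) draws (a consistent estimator of true_p)."""
    return sum(rng.random() < true_p for _ in range(n)) / n

rng = random.Random(0)   # fixed seed so the run is reproducible
true_p = 0.3             # hypothetical true parameter value

# Absolute estimation error |beta_hat_N - beta| at increasing sample sizes N
errors = {
    n: abs(sample_proportion(true_p, n, rng) - true_p)
    for n in (100, 10_000, 200_000)
}

for n, err in errors.items():
    print(f"N = {n:>7}: |estimate - true_p| = {err:.5f}")
```

The error typically shrinks as \(N\) grows. For a non-identifiable parameter, no estimator behaves this way, because the data distribution carries no information that separates the competing parameter values.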

6.1.1 What parameters do we need to consider in CDMs?

In CDMs, the model parameters are:

  • item parameters (denoted by \(\Theta\), e.g., the guessing and slip parameters of the DINA model)

  • population proportion parameters (denoted by \(\mathbf{p}\)), i.e., proportion of students in each latent class.

Note. Person parameters (attribute profiles) are not model parameters in random-effect CDMs. For instance, when we use the EM algorithm, we work with the marginalized likelihood, and marginalizing the likelihood over the attribute profiles removes the person parameters from the model.
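The marginalization can be sketched in a few lines. The example below assumes the DINA model and computes \(P(\mathbf{x})=\sum_c P(\mathbf{x}|\alpha_c)\,p(\alpha_c)\); only the item parameters and the class proportions enter the result. The Q-matrix, the parameter values, and the function names are hypothetical and chosen only for illustration.

```python
from itertools import product

def dina_item_prob(alpha, q_row, slip, guess):
    """P(X_j = 1 | alpha) under the DINA model."""
    # eta is True only if the profile has every attribute the item requires
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if eta else guess

def marginal_prob(x, Q, slips, guesses, p):
    """P(x) = sum over latent classes c of P(x | alpha_c) * p(alpha_c)."""
    total = 0.0
    for alpha, weight in p.items():
        cond = 1.0
        for x_j, q_row, s, g in zip(x, Q, slips, guesses):
            p1 = dina_item_prob(alpha, q_row, s, g)
            cond *= p1 if x_j == 1 else 1.0 - p1
        total += cond * weight
    return total

# Hypothetical two-item, two-attribute test
Q = [(1, 0), (0, 1)]
slips = [0.1, 0.2]
guesses = [0.2, 0.1]
p = {(0, 0): 0.4, (1, 0): 0.2, (0, 1): 0.1, (1, 1): 0.3}  # class proportions

# Marginal distribution over all response patterns; no person parameter appears
probs = {
    x: marginal_prob(x, Q, slips, guesses, p)
    for x in product((0, 1), repeat=len(Q))
}
for x, pr in probs.items():
    print(x, round(pr, 4))
```

The attribute profile of any individual student never appears as an argument of `marginal_prob`; it is summed out, which is why only \(\Theta\) and \(\mathbf{p}\) count as model parameters.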

6.1.2 Global identifiability in CDMs

A CDM is identifiable if all item parameters and population proportion parameters are identified. More formally,

Definition 1 (Xu, 2019)

\((\Theta,\mathbf{p})\) is said to be identifiable if for any \((\bar{\Theta},\bar{\mathbf{p}})\neq ({\Theta},{\mathbf{p}})\), there exists at least one response pattern \(\mathbf{x}\) such that

\[ P(\mathbf{x}|Q,\bar{\Theta},\bar{\mathbf{p}})\neq P(\mathbf{x}|Q,{\Theta},{\mathbf{p}}) \]

Exercise 6.1 The definition means that different sets of CDM parameters must yield different probabilities for at least one response vector.

Suppose a test with three items measures two attributes and the Q-matrix is given below:

\[ \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 1 \end{bmatrix} \]

For simplicity, suppose all items have the same slip parameter, denoted by \(s\), and the same guessing parameter, denoted by \(g\).

Please show that \(p(00)\) and \(p(01)\) are not identifiable.

Answer

It can be shown that the conditional probabilities of all response vectors are identical for \(00\) and \(01\): under the DINA model, every item requires the first attribute, which neither profile has mastered, so both profiles can answer an item correctly only by guessing. For example, \[\begin{align} P(\mathbf{X}=(000)|\alpha=(00)) &= (1-g)^3 \\ P(\mathbf{X}=(000)|\alpha=(01)) &= (1-g)^3 \end{align}\] This suggests that from a response vector, there is no way to distinguish \(00\) from \(01\), and therefore, \(p(00)\) and \(p(01)\) are not identifiable.

More formally, \[\begin{align} P(\mathbf{x}|Q,{\Theta},{\mathbf{p}}) =&\sum_c P(\mathbf{x}|\alpha_c,Q,{\Theta}) p(\alpha_c)\\ =&P(\mathbf{x}|\alpha=(00),Q,{\Theta}) p(\alpha=(00))+\\ &P(\mathbf{x}|\alpha=(10),Q,{\Theta}) p(\alpha=(10))+\\ &P(\mathbf{x}|\alpha=(01),Q,{\Theta}) p(\alpha=(01))+\\ &P(\mathbf{x}|\alpha=(11),Q,{\Theta}) p(\alpha=(11)) \end{align}\]

Because \(P(\mathbf{x}|\alpha=(00),Q,{\Theta})=P(\mathbf{x}|\alpha=(01),Q,{\Theta})\) for all \(\mathbf{x}\), the distribution of the responses depends on \(p(\alpha=(00))\) and \(p(\alpha=(01))\) only through their sum, so the two proportions cannot be uniquely determined.
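The argument can also be checked numerically. The sketch below assumes the DINA model with the exercise's Q-matrix; the values of `s` and `g` and the two proportion vectors `p_a` and `p_b` are hypothetical. It verifies that profiles \(00\) and \(01\) have the same conditional probability for every response pattern, and that two proportion vectors that split the same total mass differently between \(00\) and \(01\) produce the same marginal distribution.

```python
from itertools import product

Q = [(1, 0), (1, 0), (1, 1)]   # Q-matrix from the exercise
s, g = 0.1, 0.2                # hypothetical common slip and guessing values

def cond_prob(x, alpha):
    """P(x | alpha) under DINA with common slip s and guessing g."""
    prob = 1.0
    for x_j, q_row in zip(x, Q):
        eta = all(a >= q for a, q in zip(alpha, q_row))
        p1 = 1.0 - s if eta else g          # P(X_j = 1 | alpha)
        prob *= p1 if x_j == 1 else 1.0 - p1
    return prob

def marginal(x, p):
    """P(x) = sum over latent classes of P(x | alpha) * p(alpha)."""
    return sum(cond_prob(x, alpha) * w for alpha, w in p.items())

patterns = list(product((0, 1), repeat=len(Q)))

# Profiles 00 and 01 are indistinguishable for every response pattern
same_conditionals = all(
    abs(cond_prob(x, (0, 0)) - cond_prob(x, (0, 1))) < 1e-12 for x in patterns
)

# Two proportion vectors that differ only in how mass is split between 00 and 01
p_a = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}
p_b = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

same_marginals = all(
    abs(marginal(x, p_a) - marginal(x, p_b)) < 1e-12 for x in patterns
)

print(same_conditionals, same_marginals)  # prints: True True
```

Since `p_a` and `p_b` are different parameter values that give the same probability to every response pattern, no response pattern satisfies the inequality in Definition 1, which is exactly what non-identifiability means here.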

References

Everitt, B., & Howell, D. C. (Eds.). (2005). Encyclopedia of statistics in behavioral science. John Wiley & Sons.
Xu, G. (2019). Identifiability and cognitive diagnosis models. In M. von Davier & Y.-S. Lee (Eds.), Handbook of diagnostic classification models (pp. 333–357). Springer International Publishing. http://link.springer.com/10.1007/978-3-030-05584-4_16
Xu, G., & Zhang, S. (2016). Identifiability of diagnostic classification models. Psychometrika, 81, 625–649.