12.8 Stepwise method

To address both limitations, Ma & de la Torre (2020) proposed a stepwise procedure that uses the Wald test as the central component for determining the best q-vector for each item.

To show how the Wald test can be used for q-vector validation, assume an item measures three attributes. To test whether Attribute 1 is statistically required, the null hypothesis (Attribute 1 is not necessary) can be written as

\[\begin{equation} \begin{bmatrix} 1&-1&0&0&0&0&0&0\\ 0&0&1&0&-1&0&0&0\\ 0&0&0&1&0&-1&0&0\\ 0&0&0&0&0&0&-1&1 \end{bmatrix} \times \begin{bmatrix} {P}(000)\\ {P}(100)\\ {P}(010)\\ {P}(001)\\ {P}(110)\\ {P}(101)\\ {P}(011)\\ {P}(111) \end{bmatrix}=\mathbf{0}. \end{equation}\]
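The restriction matrix above can be built programmatically, which helps when the attribute being tested changes. The following is a minimal base R sketch, not the GDINA implementation: `restriction_matrix()` and `wald_stat()` are hypothetical helper names, the probability vectors are made-up values, and the covariance matrix `V` is an assumed placeholder (in practice it would come from the fitted model). The rows match the display up to the sign of the last row, which does not change the hypothesis.

```r
# Latent classes in the same order as the probability vector in the text.
patterns <- rbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0), c(0, 0, 1),
                  c(1, 1, 0), c(1, 0, 1), c(0, 1, 1), c(1, 1, 1))

# Restriction matrix for "attribute k is not necessary": each row contrasts
# two latent classes that differ only in attribute k.
restriction_matrix <- function(patterns, k) {
  labels <- apply(patterns, 1, paste, collapse = "")
  without_k <- which(patterns[, k] == 0)
  R <- matrix(0, nrow = length(without_k), ncol = nrow(patterns),
              dimnames = list(NULL, labels))
  for (r in seq_along(without_k)) {
    i <- without_k[r]
    partner <- patterns[i, ]
    partner[k] <- 1
    j <- which(labels == paste(partner, collapse = ""))
    R[r, i] <- 1
    R[r, j] <- -1
  }
  R
}

# Standard Wald statistic for H0: R %*% P = 0, given a covariance matrix V of P.
wald_stat <- function(R, P, V) {
  d <- R %*% P
  drop(t(d) %*% solve(R %*% V %*% t(R)) %*% d)
}

R1 <- restriction_matrix(patterns, k = 1)

# Hypothetical probabilities that depend on attributes 2 and 3 only (H0 true).
P_null <- c(0.2, 0.2, 0.5, 0.4, 0.5, 0.4, 0.9, 0.9)
# Hypothetical probabilities where attribute 1 matters (H0 false).
P_alt <- c(0.2, 0.6, 0.5, 0.4, 0.8, 0.7, 0.9, 0.95)
# Assumed covariance matrix of the estimated probabilities (illustration only).
V <- diag(0.001, nrow(patterns))

W_null <- wald_stat(R1, P_null, V)
W_alt <- wald_stat(R1, P_alt, V)
p_alt <- pchisq(W_alt, df = nrow(R1), lower.tail = FALSE)
```

Under the null hypothesis the statistic is compared with a chi-square distribution whose degrees of freedom equal the number of rows of the restriction matrix (four here).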

The stepwise procedure is implemented item by item. Specifically, the first required attribute is chosen based on the PVAF, whereas choosing the next required attributes, if any, is based on both the Wald test and the PVAF. The Wald test serves as a hypothesis test, and the PVAF functions as an effect size measure, which can be critical when more than one attribute is deemed necessary based on the Wald test. More specifically, for item \(j\), the algorithm is conducted as follows:

Step 1

Define \(\Omega=\{1,\ldots,K\}\) as the set of indices for all \(K\) attributes. Also, let \(A\) be the set of indices for all required attributes identified during the validation process, and let \(B=\Omega\setminus A\). The attributes indexed in set \(B\) are called target attributes because their necessity needs to be examined. Initialize \(A=\emptyset\), and thus \(B=\{1,\ldots,K\}\). Define a q-vector search bank \(C\) consisting of \(K\) single-attribute competing q-vectors. Replace the provisional q-vector (i.e., \(\mathbf{q}_{jh}\) in the Q\(_\text{C}\)-matrix) with each of the competing q-vectors in \(C\), and calculate their associated PVAFs. The target attribute required by the competing q-vector producing the largest PVAF is defined as a required attribute. Assume this is attribute \(k'\), and update sets \(A\) and \(B\): \(A=\{k'\}\) and \(B=\Omega\setminus A\).
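Step 1 can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the PVAF is taken to be the weighted variance of success probabilities under a candidate q-vector divided by the same quantity under the all-ones q-vector, and the success probabilities `P` and class weights `w` are hypothetical values rather than estimates from a fitted model. The helper names `varsigma2()` and `pvaf()` are not GDINA functions.

```r
K <- 3
patterns <- as.matrix(expand.grid(A1 = 0:1, A2 = 0:1, A3 = 0:1))

# Hypothetical success probabilities (attribute 2 matters most) and
# hypothetical, equal latent class weights.
P <- 0.1 + 0.1 * patterns[, 1] + 0.6 * patterns[, 2] + 0.1 * patterns[, 3]
w <- rep(1 / nrow(patterns), nrow(patterns))

# Weighted variance of success probabilities after collapsing latent classes
# that share the same values on the attributes required by q.
varsigma2 <- function(q, patterns, P, w) {
  group <- apply(patterns[, q == 1, drop = FALSE], 1, paste, collapse = "")
  w_g <- tapply(w, group, sum)
  P_g <- tapply(w * P, group, sum) / w_g
  P_bar <- sum(w * P)
  sum(w_g * (P_g - P_bar)^2)
}

# Assumed PVAF definition: variance under q relative to the all-ones q-vector.
pvaf <- function(q, patterns, P, w) {
  varsigma2(q, patterns, P, w) /
    varsigma2(rep(1, ncol(patterns)), patterns, P, w)
}

# Search bank C: the K single-attribute q-vectors, one per row.
C <- diag(K)
pvafs <- apply(C, 1, pvaf, patterns = patterns, P = P, w = w)

# The attribute with the largest PVAF becomes the first required attribute.
k_prime <- which.max(pvafs)
A <- k_prime
B <- setdiff(seq_len(K), A)
```

With these made-up values, attribute 2 is selected first, so `A` contains 2 and `B` contains 1 and 3, which mirrors the three-attribute example used in Step 2.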

Step 2

Check whether the q-vector requiring the attributes indexed in set \(A\) has a PVAF greater than 0.95. If yes, the validation process terminates; otherwise, update the search bank \(C\) so that each competing q-vector requires all attributes indexed in set \(A\) and one target attribute indexed in set \(B\). As a result, each competing q-vector in this step contains at least two ones. For example, assume we have three attributes and that, in the first step, \((0,1,0)\) had a larger PVAF than \((1,0,0)\) and \((0,0,1)\). Therefore, \(A=\{2\}\) and \(B=\{1,3\}\). The competing q-vectors are \((1,1,0)\) and \((0,1,1)\), both of which require attribute 2 because it is indexed in set \(A\). Each competing q-vector also requires one target attribute (i.e., attributes 1 and 3 for the first and second competing q-vectors, respectively).

The Wald test is used to examine whether the target attribute is statistically necessary for each competing q-vector. If none of the target attributes is required, the validation process terminates; if at least one target attribute is required, the one specified in the competing q-vector with the largest PVAF is taken to be required, and the associated q-vector is the best among all current competing q-vectors. The index of the target attribute in this q-vector is added to set \(A\) and removed from set \(B\). The necessity of the other required attributes in this competing q-vector (i.e., all except the target one) is then examined using the Wald test as well. If any of them is deemed statistically unnecessary after the target attribute has been included, its index is moved from set \(A\) to set \(B\). Step 2 is repeated until no new index can be added to or removed from sets \(A\) and \(B\).
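The construction of the updated search bank can be illustrated with a short base R sketch. The function name `competing_qvectors()` is hypothetical and is used here only to reproduce the three-attribute example above.

```r
# Build the Step 2 search bank: every competing q-vector requires all
# attributes in A plus exactly one target attribute from B.
competing_qvectors <- function(A, K) {
  B <- setdiff(seq_len(K), A)
  stopifnot(length(B) > 0)  # nothing left to test if B is empty
  C <- matrix(0L, nrow = length(B), ncol = K,
              dimnames = list(paste0("target_", B), paste0("A", seq_len(K))))
  for (r in seq_along(B)) {
    C[r, c(A, B[r])] <- 1L
  }
  C
}

# Example from the text: A = {2} and K = 3.
C <- competing_qvectors(A = 2, K = 3)
C
```

Each row of `C` is one competing q-vector, and the row name records which target attribute that q-vector adds.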

Step 1 and Step 2 are implemented for each category of each item. The former aims to determine the first required attribute using the PVAF, and the latter attempts to identify other required attributes, if any, using the Wald test, in conjunction with the PVAF when necessary. After Step 2 ends, all attributes indexed in set \(A\) are believed to be required for the studied category. This process is said to be implemented in a stepwise manner in that the necessity of the attributes is evaluated iteratively, similar to the stepwise procedure for model selection in linear regression. It should be noted that at the beginning of Step 2, the PVAF of the current q-vector is calculated and compared with 0.95. This check is not mandatory, but it is useful when the sample size is large, because the hypothesis test then tends to reject the null hypothesis and yield over-specified q-vectors.
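The control flow of the two steps can be summarized in a compact base R sketch. It is a simplified illustration, not the GDINA implementation: `stepwise_search()` is a hypothetical function, the PVAF and Wald p-value are supplied as user-written functions (`pvaf_fn` and `wald_p_fn`), the significance level `alpha = 0.05` and the iteration cap `max_iter` are assumptions, and the toy functions at the bottom return made-up values.

```r
stepwise_search <- function(K, pvaf_fn, wald_p_fn, eps = 0.95,
                            alpha = 0.05, max_iter = 100) {
  q_from <- function(idx) {
    q <- integer(K)
    q[idx] <- 1L
    q
  }

  # Step 1: the single attribute with the largest PVAF enters set A.
  A <- which.max(vapply(seq_len(K), function(k) pvaf_fn(q_from(k)), numeric(1)))

  # Step 2, repeated until nothing can be added.
  for (iter in seq_len(max_iter)) {
    B <- setdiff(seq_len(K), A)
    # Optional PVAF check at the beginning of Step 2.
    if (length(B) == 0 || pvaf_fn(q_from(A)) > eps) break

    # Wald test of each target attribute, given the attributes already in A.
    p <- vapply(B, function(b) wald_p_fn(q_from(c(A, b)), b), numeric(1))
    required <- B[p < alpha]
    if (length(required) == 0) break

    # Among the significant targets, keep the one with the largest PVAF.
    pv <- vapply(required, function(b) pvaf_fn(q_from(c(A, b))), numeric(1))
    target <- required[which.max(pv)]

    # Re-test the attributes selected earlier; drop those no longer needed.
    p_old <- vapply(A, function(a) wald_p_fn(q_from(c(A, target)), a), numeric(1))
    A <- sort(c(A[p_old < alpha], target))
  }
  q_from(A)
}

# Toy stand-ins (made-up values): attributes 2 and 3 are truly required.
toy_contrib <- c(0.02, 0.60, 0.38)
toy_pvaf <- function(q) sum(toy_contrib[q == 1])
toy_wald_p <- function(q, k) if (k %in% c(2, 3)) 0.001 else 0.5

suggested_q <- stepwise_search(K = 3, pvaf_fn = toy_pvaf, wald_p_fn = toy_wald_p)
```

In this toy run, attribute 2 enters in Step 1, attribute 3 is added in Step 2, and the search then stops because the PVAF of the resulting q-vector exceeds the cutoff.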

Flowchart

To conduct Q-matrix validation using the stepwise method, set method to “wald” in the Qval function:

stepwise <- Qval(mod1, method = "wald", eps = 0.95)
stepwise
## 
## Q-matrix validation based on Stepwise Wald test 
## 
## Suggested Q-matrix: 
## 
##    A1 A2 A3 A4 A5
## 1  1  0  0  0  0 
## 2  0  1  0  0  0 
## 3  0  0  1  0  0 
## 4  0  0  0  1  0 
## 5  0  0  0  0  1 
## 6  1  0  0  0  0 
## 7  0  1  0  0  0 
## 8  0  0  1  0  0 
## 9  0  0  0  1  0 
## 10 0  0  0  0  1 
## 11 1  1  0  0  0 
## 12 1  0  1  0  0 
## 13 1  0  0  1  0 
## 14 1  0  0  0  1 
## 15 0  1  1  0  0 
## 16 0  1  0  1  0 
## 17 0  1  0  0  1 
## 18 0  0  1  1  0 
## 19 0  0  1  0  1 
## 20 0  0  0  1  1 
## 21 1  1  1  0  0 
## 22 1  1  0  1  0 
## 23 1  1  0  0  1 
## 24 1  0  1  1  0 
## 25 1  0  1  0  1 
## 26 1  0  0  1  1 
## 27 0  1  1  1  0 
## 28 0  1  1  0  1 
## 29 0  1  0  1  1 
## 30 0* 0* 1* 1  1*
## Note: * denotes a modified element.

References

Ma, W., & de la Torre, J. (2020). An empirical Q-matrix validation method for the sequential generalized DINA model. British Journal of Mathematical and Statistical Psychology, 73(1), 142–163. https://doi.org/10.1111/bmsp.12156