Week 4 CTT Item Analysis
In this chapter, students will learn to perform CTT item analysis in R and to write a function that carries it out.
4.1 Response data
Let’s start by importing a data set using the read.table function.
resp <- read.table('https://raw.githubusercontent.com/sunbeomk/PSYC490/main/resp.txt',
header = FALSE,
sep = "\t")
The first argument is the file name, including its location. If you have a .txt file on your local computer, you can set your working directory to the folder that contains the file and then import it by name.
The second argument, header, is a logical value indicating whether the file contains the names of the variables as its first line.
The argument sep specifies how the values on each line of the file are separated. For example, sep = "," means that the separator is a comma, and sep = "\t" means that the separator is a tab.
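To see how header and sep behave, the following minimal sketch writes a tiny tab-separated file to a temporary location and reads it back. The file contents and the object names tmp and toy are made up for illustration and are not part of the course data.

```r
# Hypothetical example: write a small tab-separated file to a temporary path
tmp <- tempfile(fileext = ".txt")
writeLines(c("1\t0\t1", "0\t0\t1"), tmp)

# header = FALSE: the first line is data, not variable names
# sep = "\t": values on each line are separated by tabs
toy <- read.table(tmp, header = FALSE, sep = "\t")

dim(toy)    # number of rows and columns
names(toy)  # without a header, R assigns the default names V1, V2, V3
```

Because the file has no header line, read.table falls back to default column names, which is also what happens when resp is imported.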
The imported data set resp contains 100 subjects’ dichotomous responses to 40 GRE questions. Each row corresponds to a subject, and each column corresponds to an item. A response is coded as 1 if correct and 0 otherwise. Let’s check some properties of the imported data set.
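A minimal sketch of such checks might look like the following. So that it runs without a network connection, it uses resp_sim, a simulated stand-in with the same dimensions as resp (an assumption for illustration, not the actual data); the same calls can be applied to resp itself.

```r
# Simulated stand-in for resp: 100 subjects by 40 dichotomous items
# (hypothetical data, generated only to illustrate the checks)
set.seed(1)
resp_sim <- as.data.frame(
  matrix(rbinom(100 * 40, size = 1, prob = 0.6), nrow = 100, ncol = 40)
)

dim(resp_sim)                  # number of subjects (rows) and items (columns)
head(resp_sim[, 1:5])          # first six subjects on the first five items
table(unlist(resp_sim))        # counts of 0s and 1s across all responses
n_missing <- sum(is.na(resp_sim))  # number of missing responses
n_missing
```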
4.2 CTT Item Analysis
4.2.2 Item difficulty
The item difficulty in CTT can be obtained by calculating the proportion of correct answers to each item.
\[p_j = \frac{\sum_{i=1}^{n}X_{ij}}{n}\]
Here \(X_{ij}\) is the response of the \(i\)-th subject to the \(j\)-th item and \(n\) is the number of subjects. Since correct answers are coded as 1, the column means give the proportion correct, \(p_j\), which is the CTT item difficulty of the \(j\)-th item.
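As a small worked example of this formula, the sketch below uses a hypothetical response matrix toy_resp with 4 subjects and 3 items (made up for illustration) and computes the item difficulties both with colMeans and directly from the definition.

```r
# Hypothetical 4 x 3 response matrix: rows are subjects, columns are items
toy_resp <- matrix(c(1, 1, 0,
                     1, 0, 0,
                     1, 1, 1,
                     0, 1, 0),
                   nrow = 4, byrow = TRUE)

# Item difficulty as the column means
item_diff <- colMeans(toy_resp)
item_diff  # 0.75 0.75 0.25

# The same quantity written out as sum(X_ij) / n
n <- nrow(toy_resp)
item_diff_manual <- colSums(toy_resp) / n
all.equal(item_diff, item_diff_manual)
```

For the actual data, the same call would be colMeans(resp).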
4.2.3 Item discrimination
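The item discrimination in CTT can be obtained as the point-biserial correlation between the item response and the total score.
\[r_{j,pbis} = \left[ \frac{\bar{X}_1 - \bar{X}_0}{S_X} \right] \sqrt{\frac{n_1 n_0}{n(n-1)}}\]
Here \(\bar{X}_1\) and \(\bar{X}_0\) are the mean total scores of the subjects who answered the \(j\)-th item correctly and incorrectly, \(n_1\) and \(n_0\) are the sizes of these two groups, \(n = n_1 + n_0\), and \(S_X\) is the sample standard deviation of the total score.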
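As a numerical check of this formula, the following sketch evaluates it by hand on simulated data and compares the result with the value returned by cor. The objects item and total are hypothetical, and the check assumes that \(S_X\) is the sample standard deviation (denominator \(n - 1\)), which is what sd computes in R.

```r
# Simulated data (hypothetical): one dichotomous item and a total score
set.seed(123)
n <- 100
item <- rbinom(n, size = 1, prob = 0.6)         # 0/1 item responses
total <- item + rbinom(n, size = 39, prob = 0.5)  # total score including the item

# Ingredients of the point-biserial formula
x1_bar <- mean(total[item == 1])  # mean total score, item answered correctly
x0_bar <- mean(total[item == 0])  # mean total score, item answered incorrectly
n1 <- sum(item == 1)
n0 <- sum(item == 0)
s_x <- sd(total)                  # sample SD of the total score

r_pbis <- (x1_bar - x0_bar) / s_x * sqrt(n1 * n0 / (n * (n - 1)))

# Compare with the Pearson correlation
all.equal(r_pbis, cor(total, item))
```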
When one variable is dichotomous (0/1) and the other is continuous, \(r_{j, pbis}\) is equal to the Pearson correlation between the two variables. Let’s obtain the item discrimination of the resp data set.
n_items <- ncol(resp) # number of items
total_score <- rowSums(resp) # total score
item_disc <- numeric(n_items) # output vector
for (j in 1:n_items) { # sequence
item_disc[j] <- cor(total_score, resp[, j]) # body
}
First, we saved the number of items and the total score as the objects n_items and total_score, respectively. Before running the for loop, we created an output object item_disc, which is a zero vector of length n_items. Inside the for loop, we replace the \(j\)-th element of the output vector item_disc with the Pearson correlation between the \(j\)-th item (resp[, j]) and total_score. The \(j\)-th element of item_disc is the item discrimination of the \(j\)-th item.
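The two statistics can also be wrapped in a single function. The sketch below is one possible way to do this, not a definitive implementation: the function name item_analysis, its output format, and the simulated data resp_sim are all assumptions made for illustration.

```r
# Hypothetical helper: CTT item difficulty and discrimination for a
# subjects-by-items matrix (or data frame) of 0/1 responses
item_analysis <- function(resp) {
  resp <- as.matrix(resp)
  n_items <- ncol(resp)
  total_score <- rowSums(resp)   # total score of each subject

  # Item difficulty: proportion correct for each item
  difficulty <- colMeans(resp)

  # Item discrimination: correlation of each item with the total score
  discrimination <- numeric(n_items)
  for (j in seq_len(n_items)) {
    discrimination[j] <- cor(total_score, resp[, j])
  }

  data.frame(item = seq_len(n_items),
             difficulty = difficulty,
             discrimination = discrimination)
}

# Usage on simulated data (hypothetical, 100 subjects by 5 items)
set.seed(42)
resp_sim <- matrix(rbinom(100 * 5, size = 1, prob = 0.6), nrow = 100)
result <- item_analysis(resp_sim)
result
```

With the imported data, the call would be item_analysis(resp). Note that, as in the loop above, the total score includes the item being analyzed.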