Reliability
In this chapter, three methods for estimating internal reliability (\(\rho_{XX'}\)) are covered: coefficient \(\alpha\), KR20, and KR21.
The standard error of measurement estimates of a test can be obtained based on the reliability estimates.
In addition, the standard error of measurement estimates can be used for obtaining the confidence intervals for true scores of the subjects.
4.3 Coefficient \(\alpha\)
The coefficient \(\alpha\) provides the lower bound of the reliability of a test.
\[\rho_{XX'} \geq \alpha = \frac{N}{N-1} \left[{\frac{\sigma^2_X - \sum_{j=1}^N \sigma^2_{Y_j}}{\sigma^2_X}}\right]\] where,
N: the number of subtests (usually, items).
\(\sigma^2_X\): the total score variance of the whole test.
\(\sigma^2_{Y_j}\): the variance of the observed scores on the \(j\)-th subtest.
In practice, we often treat each item as a subtest, so \(N\) is the total number of items, and \(\sigma^2_{Y_j}\) are the score variances of the individual items.
Let’s create a function that calculates the coefficient \(\alpha\):
coeff_alpha <- function(responses) {
# Get number of items (N) and individuals
n_items <- ncol(responses)
n_persons <- nrow(responses)
# Get individual total scores
x <- rowSums(responses)
# Get observed-score variance of whole test (X)
var_x <- var(x) * (n_persons - 1) / n_persons
# Get observed-score variance of each item (Y_j)
var_y <- numeric(n_items)
for (j in 1:n_items) {
var_y[j] <- var(responses[, j]) * (n_persons - 1) / n_persons
}
# Apply the alpha formula
alpha <- (n_items / (n_items - 1)) * (1 - sum(var_y) / var_x)
alpha
}
coeff_alpha(resp)
## [1] 0.7435825
The obtained coefficient \(\alpha\) from resp
data set is 0.7436.
4.4 KR20
The Kuder-Richardson formula-20 (KR20) is a special case of the coefficient \(\alpha\). If each item is dichotomous, coefficient \(\alpha\) becomes the KR20.
\[KR20 = \frac{N}{N-1} \left[{\frac{\sigma^2_X - \sum_{j=1}^N p_j(1-p_j)}{\sigma^2_X}}\right]\]
where, \(p_j\) is the proportion of correct answers (\(Y_j = 1\)) for item \(j\).
- Note: If \(Y_j\) is dichotomous, then \(\sigma^2_{Y_j} = p_j (1 - p_j)\).
The below function returns the KR20:
KR20 <- function(responses) {
# Get number of items (N) and individuals
n_items <- ncol(responses)
n_persons <- nrow(responses)
# get p_j for each item
p <- colMeans(responses)
# Get total scores (X)
x <- rowSums(responses)
# observed score variance
var_x <- var(x) * (n_persons - 1) / n_persons
# Apply KR-20 formula
rel <- (n_items / (n_items - 1)) * (1 - sum(p * (1 - p)) / var_x)
rel
}
KR20(resp)
## [1] 0.7435825
Note that the sample variances are calculated by \(S^2_X = \frac{\sum_{i = 1}^{N} (X_i - \bar{X})^2}{n}\), instead of \(S^2_X = \frac{\sum_{i = 1}^{N} (X_i - \bar{X})^2}{n - 1}\).
4.5 KR21
The Kuder-Richardson formula-21 (KR21) is given by:
\[KR21 = \frac{N}{N-1} \left[ {\frac{\sigma_X^2 - N\bar{p}(1-\bar{p})}{\sigma^2_X}} \right]\]
where, \(\bar{p}\) is the average of item difficulties.
In general, KR20 \(\geq\) KR21. When the item difficulties are all equal, then KR20 = KR21.
The function below returns the KR21:
KR21 <- function(responses) {
# Get number of items (N) and individuals
n <- ncol(responses)
n_persons <- nrow(responses)
# Get overall item difficulty - pbar
p_bar <- mean(colMeans(responses))
# Observed score variance
x <- rowSums(responses)
var_x <- var(x) * (n_persons - 1) / n_persons
# Apply KR-21 formula
rel <- (n / (n-1)) * (1 - (n * p_bar * (1 - p_bar)) / var_x)
rel
}
KR21(resp)
## [1] 0.721086
4.6 Standard error of measurement
Based on the reliability estimates, we can also estimate the standard error of measurement of the test, given by
\[\sigma_E = \sigma_X \sqrt{1 - \rho_{XX'}}\] where, \(\sigma_X\) is the standard deviation of the total score, and \(\rho_{XX'}\) is the reliability.
The below function returns the standard error of measurement estimate with the reliability estimate of the user’s choice.
se_m <- function(responses, rel_est = "alpha") {
# The number of subjects
n_persons <- nrow(responses)
# Get the total score and its standard deviation
total <- rowSums(responses)
sd_x <- sqrt(var(total) * (n_persons - 1) / n_persons)
if (rel_est == "alpha") {
rel <- coeff_alpha(responses)
} else if (rel_est == "KR20") {
rel <- KR20(responses)
} else if (rel_est == "KR21") {
rel <- KR21(responses)
} else {
error('rel.est must be either alpha, KR20, or KR21')
}
se <- sd_x * sqrt(1 - rel)
se
}
se_m(responses = resp, rel_est = "KR20")
## [1] 2.928571
## [1] 3.054337
4.7 Your turn
Suppose you want to obtain the 95% confidence intervals for true scores of the examinees using the reliability estimate based on coefficient \(\alpha\). Complete the below code.
sem_alpha <-
total_score <-
z <-
ci_lower <- ci_upper <- numeric(100)
for (i in 1:100) {
ci_lower[i]
ci_upper[i]
}
hints:
Use
se_m()
function to obtain \(\sigma_E\).Use
rowSums()
function to obtain total scores \(X\) of all examinees.The z-score corresponding to the 95% confidence interval, \(z_{\alpha / 2}\), can be obtained by:
- At each iteration in the for loop, replace the \(i\)-th elements of
ci_lower
andci_upper
by \(X_i - z_{\alpha / 2} \sigma_E\) and \(X_i + z_{\alpha / 2} \sigma_E\).