# Reliability

• In this chapter, three methods for estimating internal reliability ($$\rho_{XX'}$$) are covered: coefficient $$\alpha$$, KR20, and KR21.

• The standard error of measurement estimates of a test can be obtained based on the reliability estimates.

• In addition, the standard error of measurement estimates can be used for obtaining the confidence intervals for true scores of the subjects.

## 4.3 Coefficient $$\alpha$$

The coefficient $$\alpha$$ provides the lower bound of the reliability of a test.

$\rho_{XX'} \geq \alpha = \frac{N}{N-1} \left[{\frac{\sigma^2_X - \sum_{j=1}^N \sigma^2_{Y_j}}{\sigma^2_X}}\right]$ where,

• N: the number of subtests (usually, items).

• $$\sigma^2_X$$: the total score variance of the whole test.

• $$\sigma^2_{Y_j}$$: the variance of the observed scores on the $$j$$-th subtest.

In practice, we often treat each item as a subtest, so $$N$$ is the total number of items, and $$\sigma^2_{Y_j}$$ are the score variances of the individual items.

Let’s create a function that calculates the coefficient $$\alpha$$:

coeff_alpha <- function(responses) {
# Get number of items (N) and individuals
n_items <- ncol(responses)
n_persons <- nrow(responses)
# Get individual total scores
x <- rowSums(responses)
# Get observed-score variance of whole test (X)
var_x <- var(x) * (n_persons - 1) / n_persons
# Get observed-score variance of each item (Y_j)
var_y <- numeric(n_items)
for (j in 1:n_items) {
var_y[j] <- var(responses[, j]) * (n_persons - 1) / n_persons
}
# Apply the alpha formula
alpha <- (n_items / (n_items - 1)) * (1 - sum(var_y) / var_x)

alpha
}

coeff_alpha(resp)
## [1] 0.7435825

The obtained coefficient $$\alpha$$ from resp data set is 0.7436.

## 4.4 KR20

The Kuder-Richardson formula-20 (KR20) is a special case of the coefficient $$\alpha$$. If each item is dichotomous, coefficient $$\alpha$$ becomes the KR20.

$KR20 = \frac{N}{N-1} \left[{\frac{\sigma^2_X - \sum_{j=1}^N p_j(1-p_j)}{\sigma^2_X}}\right]$

where, $$p_j$$ is the proportion of correct answers ($$Y_j = 1$$) for item $$j$$.

• Note: If $$Y_j$$ is dichotomous, then $$\sigma^2_{Y_j} = p_j (1 - p_j)$$.

The below function returns the KR20:

KR20 <- function(responses) {
# Get number of items (N) and individuals
n_items <- ncol(responses)
n_persons <- nrow(responses)
# get p_j for each item
p <- colMeans(responses)
# Get total scores (X)
x <- rowSums(responses)
# observed score variance
var_x <- var(x) * (n_persons - 1) / n_persons
# Apply KR-20 formula
rel <- (n_items / (n_items - 1)) * (1 - sum(p * (1 - p)) / var_x)

rel
}

KR20(resp)
## [1] 0.7435825

Note that the sample variances are calculated by $$S^2_X = \frac{\sum_{i = 1}^{N} (X_i - \bar{X})^2}{n}$$, instead of $$S^2_X = \frac{\sum_{i = 1}^{N} (X_i - \bar{X})^2}{n - 1}$$.

## 4.5 KR21

The Kuder-Richardson formula-21 (KR21) is given by:

$KR21 = \frac{N}{N-1} \left[ {\frac{\sigma_X^2 - N\bar{p}(1-\bar{p})}{\sigma^2_X}} \right]$

where, $$\bar{p}$$ is the average of item difficulties.

In general, KR20 $$\geq$$ KR21. When the item difficulties are all equal, then KR20 = KR21.

The function below returns the KR21:

KR21 <- function(responses) {
# Get number of items (N) and individuals
n <- ncol(responses)
n_persons <- nrow(responses)
# Get overall item difficulty  - pbar
p_bar <- mean(colMeans(responses))
# Observed score variance
x <- rowSums(responses)
var_x <- var(x) * (n_persons - 1) / n_persons
# Apply KR-21 formula
rel <- (n / (n-1)) * (1 - (n * p_bar * (1 - p_bar)) / var_x)

rel
}
KR21(resp)
## [1] 0.721086

## 4.6 Standard error of measurement

Based on the reliability estimates, we can also estimate the standard error of measurement of the test, given by

$\sigma_E = \sigma_X \sqrt{1 - \rho_{XX'}}$ where, $$\sigma_X$$ is the standard deviation of the total score, and $$\rho_{XX'}$$ is the reliability.

The below function returns the standard error of measurement estimate with the reliability estimate of the user’s choice.

se_m <- function(responses, rel_est = "alpha") {
# The number of subjects
n_persons <- nrow(responses)
# Get the total score and its standard deviation
total <- rowSums(responses)
sd_x <- sqrt(var(total) * (n_persons - 1) / n_persons)

if (rel_est == "alpha") {
rel <- coeff_alpha(responses)
} else if (rel_est == "KR20") {
rel <- KR20(responses)
} else if (rel_est == "KR21") {
rel <- KR21(responses)
} else {
error('rel.est must be either alpha, KR20, or KR21')
}

se <- sd_x * sqrt(1 - rel)

se
}

se_m(responses = resp, rel_est = "KR20")
## [1] 2.928571
se_m(responses = resp, rel_est = "KR21")
## [1] 3.054337

### 4.6.1 Confidence interval for true score

The standard error of measurement estimate can be used to construct the confidence interval of true score based on observed score as:

$X \pm z_{\alpha / 2} \sigma_E = X \pm z_{\alpha / 2} \sigma_X \sqrt{1 - \rho_{XX'}}$

Suppose you want to obtain the 95% confidence intervals for true scores of the examinees using the reliability estimate based on coefficient $$\alpha$$. Complete the below code.

sem_alpha <-

total_score <-

z <-

ci_lower <- ci_upper <- numeric(100)

for (i in 1:100) {
ci_lower[i]
ci_upper[i]
}

hints:

• Use se_m() function to obtain $$\sigma_E$$.

• Use rowSums() function to obtain total scores $$X$$ of all examinees.

• The z-score corresponding to the 95% confidence interval, $$z_{\alpha / 2}$$, can be obtained by:

alpha <- 0.05
z <- qnorm(1 - (alpha / 2))
• At each iteration in the for loop, replace the $$i$$-th elements of ci_lower and ci_upper by $$X_i - z_{\alpha / 2} \sigma_E$$ and $$X_i + z_{\alpha / 2} \sigma_E$$.