Reliability

  • In this chapter, three methods for estimating internal reliability (ρXX) are covered: coefficient α, KR20, and KR21.

  • The standard error of measurement estimates of a test can be obtained based on the reliability estimates.

  • In addition, the standard error of measurement estimates can be used for obtaining the confidence intervals for true scores of the subjects.

4.3 Coefficient α

The coefficient α provides the lower bound of the reliability of a test.

ρXXα=NN1[σ2XNj=1σ2Yjσ2X] where,

  • N: the number of subtests (usually, items).

  • σ2X: the total score variance of the whole test.

  • σ2Yj: the variance of the observed scores on the j-th subtest.

In practice, we often treat each item as a subtest, so N is the total number of items, and σ2Yj are the score variances of the individual items.

Let’s create a function that calculates the coefficient α:

coeff_alpha <- function(responses) {
  # Get number of items (N) and individuals
  n_items <- ncol(responses)
  n_persons <- nrow(responses)
  # Get individual total scores
  x <- rowSums(responses)
  # Get observed-score variance of whole test (X)
  var_x <- var(x) * (n_persons - 1) / n_persons
  # Get observed-score variance of each item (Y_j)
  var_y <- numeric(n_items)
  for (j in 1:n_items) {
    var_y[j] <- var(responses[, j]) * (n_persons - 1) / n_persons
  }
  # Apply the alpha formula
  alpha <- (n_items / (n_items - 1)) * (1 - sum(var_y) / var_x)
  
  alpha
}

coeff_alpha(resp)
## [1] 0.7435825

The obtained coefficient α from resp data set is 0.7436.

4.4 KR20

The Kuder-Richardson formula-20 (KR20) is a special case of the coefficient α. If each item is dichotomous, coefficient α becomes the KR20.

KR20=NN1[σ2XNj=1pj(1pj)σ2X]

where, pj is the proportion of correct answers (Yj=1) for item j.

  • Note: If Yj is dichotomous, then σ2Yj=pj(1pj).

The below function returns the KR20:

KR20 <- function(responses) {
  # Get number of items (N) and individuals
  n_items <- ncol(responses)
  n_persons <- nrow(responses)
  # get p_j for each item
  p <- colMeans(responses)
  # Get total scores (X)
  x <- rowSums(responses)
  # observed score variance
  var_x <- var(x) * (n_persons - 1) / n_persons
  # Apply KR-20 formula
  rel <- (n_items / (n_items - 1)) * (1 - sum(p * (1 - p)) / var_x)
  
  rel
}

KR20(resp)
## [1] 0.7435825

Note that the sample variances are calculated by S2X=Ni=1(XiˉX)2n, instead of S2X=Ni=1(XiˉX)2n1.

4.5 KR21

The Kuder-Richardson formula-21 (KR21) is given by:

KR21=NN1[σ2XNˉp(1ˉp)σ2X]

where, ˉp is the average of item difficulties.

In general, KR20 KR21. When the item difficulties are all equal, then KR20 = KR21.

The function below returns the KR21:

KR21 <- function(responses) {
  # Get number of items (N) and individuals
  n <- ncol(responses)
  n_persons <- nrow(responses)
  # Get overall item difficulty  - pbar
  p_bar <- mean(colMeans(responses))
  # Observed score variance
  x <- rowSums(responses)
  var_x <- var(x) * (n_persons - 1) / n_persons
  # Apply KR-21 formula
  rel <- (n / (n-1)) * (1 - (n * p_bar * (1 - p_bar)) / var_x)
  
  rel
}
KR21(resp)
## [1] 0.721086

4.6 Standard error of measurement

Based on the reliability estimates, we can also estimate the standard error of measurement of the test, given by

σE=σX1ρXX where, σX is the standard deviation of the total score, and ρXX is the reliability.

The below function returns the standard error of measurement estimate with the reliability estimate of the user’s choice.

se_m <- function(responses, rel_est = "alpha") {
  # The number of subjects
  n_persons <- nrow(responses)
  # Get the total score and its standard deviation
  total <- rowSums(responses)
  sd_x <- sqrt(var(total) * (n_persons - 1) / n_persons)
  
  if (rel_est == "alpha") {
    rel <- coeff_alpha(responses)
  } else if (rel_est == "KR20") {
    rel <- KR20(responses)
  } else if (rel_est == "KR21") {
    rel <- KR21(responses)
  } else {
    error('rel.est must be either alpha, KR20, or KR21')
  }
  
  se <- sd_x * sqrt(1 - rel)
  
  se
}

se_m(responses = resp, rel_est = "KR20")
## [1] 2.928571
se_m(responses = resp, rel_est = "KR21")
## [1] 3.054337

4.6.1 Confidence interval for true score

The standard error of measurement estimate can be used to construct the confidence interval of true score based on observed score as:

X±zα/2σE=X±zα/2σX1ρXX

4.7 Your turn

Suppose you want to obtain the 95% confidence intervals for true scores of the examinees using the reliability estimate based on coefficient α. Complete the below code.

sem_alpha <- 

total_score <- 

z <- 

ci_lower <- ci_upper <- numeric(100)

for (i in 1:100) {
  ci_lower[i]
  ci_upper[i]
}

hints:

  • Use se_m() function to obtain σE.

  • Use rowSums() function to obtain total scores X of all examinees.

  • The z-score corresponding to the 95% confidence interval, zα/2, can be obtained by:

alpha <- 0.05
z <- qnorm(1 - (alpha / 2))
  • At each iteration in the for loop, replace the i-th elements of ci_lower and ci_upper by Xizα/2σE and Xi+zα/2σE.