# Chapter 8 Rating Scale Evaluation

## 8.1 R-lab purpose

The purpose of this lab is to give students experience running some basic rating scale analyses using Rasch models. From Blackboard, please download and save these files to a convenient location on your computer.

ratings.csv pain_cleaned_numeric.csv

We will use “eRm” package for this analysis.

## Install & load the eRm package:
# install.packages("eRm")  # Install the pacakage if you do not have it
library("eRm")

## 8.2 R-lab: Rating Scale Analysis on Example Survey Data

For our first analysis, we will try conducting basic rating scale functioning analyses using some example survey data.

### 8.2.1 Load and investigate the data set

# load the example survey data into your working environment and generate a summary of the data
summary(survey.responses)
##        id               i1             i2              i3
##  Min.   :  1.00   Min.   :0.00   Min.   :0.000   Min.   :0.000
##  1st Qu.: 88.25   1st Qu.:1.00   1st Qu.:1.000   1st Qu.:2.000
##  Median :175.50   Median :2.00   Median :2.000   Median :3.000
##  Mean   :175.50   Mean   :1.86   Mean   :1.654   Mean   :2.934
##  3rd Qu.:262.75   3rd Qu.:3.00   3rd Qu.:3.000   3rd Qu.:4.000
##  Max.   :350.00   Max.   :4.00   Max.   :4.000   Max.   :4.000
##        i4              i5              i6              i7
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:1.000
##  Median :2.000   Median :2.000   Median :3.000   Median :2.000
##  Mean   :1.937   Mean   :2.114   Mean   :2.543   Mean   :2.046
##  3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000
##  Max.   :4.000   Max.   :4.000   Max.   :4.000   Max.   :4.000
##        i8              i9             i10             i11
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000
##  1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:1.000
##  Median :3.000   Median :2.000   Median :2.000   Median :2.000
##  Mean   :2.851   Mean   :2.046   Mean   :2.311   Mean   :2.277
##  3rd Qu.:4.000   3rd Qu.:3.000   3rd Qu.:3.000   3rd Qu.:3.000
##  Max.   :4.000   Max.   :4.000   Max.   :4.000   Max.   :4.000
##       i12             i13             i14             i15
##  Min.   :0.000   Min.   :0.000   Min.   :0.000   Min.   :0.000
##  1st Qu.:1.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:1.000
##  Median :2.000   Median :3.000   Median :3.000   Median :2.000
##  Mean   :1.809   Mean   :2.729   Mean   :2.463   Mean   :1.711
##  3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:3.000   3rd Qu.:3.000
##  Max.   :4.000   Max.   :4.000   Max.   :4.000   Max.   :4.000
• Examine the summary of the data to see if there are any missing values or other potentially problematic things we should know about before analyzing the data

Then, let’s trim the data. Remove the “id” variable from the survey data because it’s not part of the item responses.

survey.responses <- survey.responses[,-1]
# Use the apply function on table() function to check the frequencies of item responses in each category
apply(survey.responses,2,table)
##    i1 i2  i3  i4 i5 i6  i7  i8  i9 i10 i11 i12 i13 i14 i15
## 0  49 71   8  49 32 12  32   9  35  28  27  66   9  22  60
## 1  92 95  18  75 86 56  88  39  88  57  66  85  41  52  92
## 2 104 91  91 112 98 99 102  65 100 106  96  81  93  90 107
## 3  69 70 105  77 78 96  88 119  80  96 105  86 100 114  71
## 4  36 23 128  37 56 87  40 118  47  63  56  32 107  72  20
• From these results, look to see if there are any categories with very few responses such that recoding might be warranted.

### 8.2.2 Run the Partial Credit Model

# Run the Partial Credit model on the item responses
PC_model <- PCM(survey.responses)
# After above code runs, the results are stored in an object called “PC_model”

### 8.2.3 Check Categories for Monotonicity

Check category threshold estimates for non-decreasing (i.e., monotonic) ordering over increasing categories.

# This is a function that finds the threshold values specific to each item and plots them.
responses <- survey.responses # update this if needed with the responses object
model.results <- PC_model # update this if needed with the model results object

for(item.number in 1:ncol(responses)){

n.thresholds <-  length(table(responses[, item.number]))-1

taus <- (thresholds(model.results))

taus.item <- c(taus$threshpar[((item.number*n.thresholds)-(n.thresholds-1)): (item.number*n.thresholds)])*-1 taus.item <- as.vector(unlist(taus.item))*-1 plot(c(1:length(taus.item)), taus.item, ylab = "Threshold Location", xlab = "Threshold Number", main = paste("Item ", item.number, "Threshold Locations"), axes = FALSE, ylim = c(floor(min(taus.item)), ceiling(max(taus.item)))) axis(1, at = c(1:length(taus.item))) axis(2, at = c(floor(min(taus.item)): ceiling(max(taus.item))), labels = c(floor(min(taus.item)): ceiling(max(taus.item)))) lines(taus.item) } • Running the above code will generate plots for each item that show the estimated location for each threshold. • Use the arrow buttons in the Plots window to navigate through the plots and examine the plots for evidence of non-decreasing threshold locations on the y-axis as the thresholds progress along the x-axis. • You can also check the distance between the threshold locations for evidence of distinct categories. You can export the item and threshold location estimates for use in reporting ### Export overall item & threshold estimates: item.estimates <- as.data.frame(taus$threshtable)

item.estimates <- round(item.estimates, digits = 2)

write.csv(item.estimates, file = "item_thresholds_PC_Model.csv")

### 8.2.4 Examine Category Probability curves

Examine category probability curves for evidence of distinct and ordered categories.

# The function below that finds the threshold values specific to each item and plots them.
responses <- survey.responses # update to data object if using with other dataset!
model.results <- PC_model # update to model object if using with other dataset!
for(item.number in 1:ncol(survey.responses)){
plotICC(model.results, item.subset = item.number)
}

The code above will produce category probability plots in the Plots window.

• Use the arrow buttons in the Plots window to navigate through the plots and examine the plots for evidence of:
• Ordered category curves
• Distinct (modal) categories

## 8.3 Example APA-format Results section for rating scale analysis

We analyzed 350 participants’ responses to a 15-item survey that included items with five ordered categories ($$X$$ = 0, 1, 2, 3, or 4) for evidence of useful rating scale category functioning. Specifically, following Linacre (2002) and Engelhard and Wind (2018), we examined the survey item responses for evidence of directional orientation with the latent variable and distinct categories on the construct. So that we could consider category functioning specific to each item with the most diagnostic utility, we used the Partial Credit Model (Masters, 1982) to analyze the data. We conducted the analyses using the Extended Rasch Models (“eRm”) package (Mair, Hatzinger, & Maier, 2020) for R.

Participant responses to all 15 items included observations in all five rating scale categories. All of the items included at least 10 responses in each category except for the lowest rating scale category ($$X$$ = 0) for Item 3 (frequency = 8), Item 8 (frequency = 9), and Item 13 (frequency = 9). In a future administration of this survey, researchers may consider reducing the scale length for these items to avoid unstable estimates.

First, we examined the estimated threshold locations for each item for evidence of directional orientation with the latent variable (i.e., non-decreasing locations as categories increase). Because all of the items included responses in five categories, there were four threshold estimates for each item (τi1, τi2, τi3, τi4). Accordingly, we examined whether τi1 ≤ τi2, τi2 ≤ τi3, and τi3 ≤ τi4 for each item. Table 1 includes the threshold estimates for each item, along with the overall item location estimate. For Item 3, the first and second threshold were disordered (τ3,1 = -1.46, τ3,2 = -1.68; see Figure 1). For all of the other items, the thresholds were ordered as expected given the ordinal rating scale categories.

Next, we examined category probability curves for evidence that the rating scale categories were ordered as expected (non-decreasing threshold locations over increasing categories) and distinct for each item. Distinct categories imply that each rating scale category describes a meaningful range of locations on the logit scale that represents the construct. The categories appeared to function as expected for all of the items except Item 3 (see Figure 2). For this item, the first and second rating scale categories were disordered.

## 8.4 References

Andrich, D. A. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561–573. https://doi.org/10.1007/BF02293814

Engelhard, G., & Wind, S. A. (2018). Invariant measurement with raters and rating scales: Rasch models for rater-mediated assessments. Taylor & Francis.

Linacre, J. M. (2002). Optimizing rating scale category effectiveness. Journal of Applied Measurement, 3(1), 85–106.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174. https://doi.org/10.1007/BF02296272

Mair, P., Hatzinger, R., & Maier M. J. (2020). eRm: Extended Rasch Modeling. 1.0-1. https://cran.r-project.org/package=eRm