Chapter 2 Dichotomous Rasch Models

2.1 Dichotomous Rasch Model

2.1.1 Definition

\[\ln\left[\frac{\phi_{ni1}}{\phi_{ni0}}\right]=\theta_{n}-\delta_{i}\] The Rasch model predicts the probability that person n gives a correct response (x = 1), rather than an incorrect response (x = 0), to item i, given the person’s “ability” (θn) and the item’s difficulty (δi). Here φni1 and φni0 are the probabilities of a correct and an incorrect response, respectively.
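
Solving the log-odds expression for the response probabilities gives the form that is usually computed in practice. This is a standard algebraic rearrangement rather than anything specific to this chapter; it assumes only that the two probabilities sum to one, because the response is dichotomous.

```latex
\[
\phi_{ni1}=\frac{\exp(\theta_{n}-\delta_{i})}{1+\exp(\theta_{n}-\delta_{i})},
\qquad
\phi_{ni0}=1-\phi_{ni1}=\frac{1}{1+\exp(\theta_{n}-\delta_{i})}
\]
```

When θn = δi, both probabilities equal 0.5, so an item's difficulty can be read as the ability level at which a correct response is as likely as an incorrect one.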

2.1.2 Basic Rasch Model & IRT Model Assumptions

  • Each person can be characterized by an ability (i.e., level), denoted as θn

  • Each item can be characterized by a difficulty, denoted as δi

  • Person ability and item difficulty can be expressed on one linear continuum (line)

  • The probability of observing any particular response can be calculated from the difference between ability and difficulty
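
The last assumption can be illustrated with a few lines of base R. This is a minimal sketch: `rasch_prob()` is a hypothetical helper written for this illustration (it is not part of TAM), and the ability and difficulty values are made up.

```r
# Probability of a correct response under the dichotomous Rasch model.
# rasch_prob() is a hypothetical helper, not a TAM function.
rasch_prob <- function(theta, delta) {
  plogis(theta - delta)  # exp(theta - delta) / (1 + exp(theta - delta))
}

# A person located exactly at the item's difficulty
rasch_prob(theta = 0, delta = 0)

# The same person (theta = 0) on an easier item (delta = -2)
# and on a harder item (delta = 1)
rasch_prob(theta = 0, delta = c(-2, 1))

# Only the difference matters: shifting both locations by the same
# amount leaves the probability unchanged
all.equal(rasch_prob(1, 0.5), rasch_prob(3, 2.5))
```

Because the probability depends only on θn − δi, persons and items can be placed on the same logit scale and compared directly.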

2.2 R-Lab: Running Dichotomous Rasch Model in R

2.2.1 Understand the Data set

In this example, we will be working with data from a transitive reasoning test, which is a reasoning test related to relationships among physical objects. The transitive reasoning data were collected from a one-on-one interactive assessment in which an experimenter presented students with a set of objects, such as sticks, balls, cubes, and discs. The following description is given in Sijtsma and Molenaar (2002), pp. 31-32:

The items for transitive reasoning had the following structure. A typical item used three sticks, here denoted A, B, and C, of different length, denoted Y, such that YA < YB < YC. The actual test taking had the form of a conversation between experimenter and child in which the sticks were identified by their colors rather than letters. First, sticks A and B were presented to a child, who was allowed to pick them up and compare their lengths, for example, by placing them next to each other on a table.

Next, sticks B and C were presented and compared. Then all three sticks were displayed in a random order at large mutual distances so that their length differences were imperceptible, and the child was asked to infer the relation between sticks A and C from his or her knowledge of the relationship in the other two pairs.

The transitive reasoning items varied in terms of the property students were asked to reason about (length, weight, area). The tasks also varied in terms of the number of items students were asked to reason about, and whether the tasks involved equalities, inequalities, or a mixture of equalities and inequalities. The characteristics of the transitive reasoning data are summarized in the following table:

Task  Property  Format              Objects  Measures
1     Length    YA > YB > YC        Sticks   12, 11.5, 11 (cm)
2     Length    YA = YB = YC = YD   Tubes    12 (cm)
3     Weight    YA > YB > YC        Tubes    45, 25, 18 (g)
4     Weight    YA = YB = YC = YD   Cubes    65 (g)
5     Weight    YA < YB < YC        Balls    40, 50, 70 (g)
6     Area      YA > YB > YC        Discs    2.5, 7, 6.5 (diameter; cm)
7     Length    YA > YB = YC        Sticks   28.5, 27.5, 27.5 (cm)
8     Weight    YA > YB = YC        Balls    65, 40, 40 (g)
9     Length    YA = YB = YC = YD   Sticks   12.5, 12.5, 13, 13 (cm)
10    Weight    YA = YB < YC = YD   Balls    60, 60, 100, 100 (g)

2.2.2 Prepare the R Packages for analysis

In this session, we use the “TAM” package to run the Rasch analysis. TAM stands for “Test Analysis Modules”. Unlike the Winsteps software, the tam() function used in this lab applies marginal maximum likelihood estimation (MMLE) rather than joint maximum likelihood estimation (JMLE). Note: Because the estimation methods differ, the results may differ slightly from your Winsteps output.
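
The difference between the two estimation methods can be written out in terms of the likelihood that each one maximizes. The expressions are a textbook-level sketch rather than TAM's exact implementation; in particular, the normal ability distribution with mean 0 and variance σ² in the MML likelihood is stated here as an assumption about the default setup.

```latex
% JML: person and item parameters are estimated jointly
\[
L_{\mathrm{JML}}(\boldsymbol{\theta},\boldsymbol{\delta})
=\prod_{n=1}^{N}\prod_{i=1}^{I}
\frac{\exp\left[x_{ni}(\theta_{n}-\delta_{i})\right]}{1+\exp(\theta_{n}-\delta_{i})}
\]

% MML: person parameters are integrated out over an assumed ability distribution
\[
L_{\mathrm{MML}}(\boldsymbol{\delta},\sigma^{2})
=\prod_{n=1}^{N}\int\prod_{i=1}^{I}
\frac{\exp\left[x_{ni}(\theta-\delta_{i})\right]}{1+\exp(\theta-\delta_{i})}
\,f(\theta;0,\sigma^{2})\,d\theta
\]
```

Under JML every person contributes a parameter, whereas under MML only the item difficulties and the variance of the ability distribution are estimated, and person measures are computed in a separate step afterwards.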

# Load the R package that we need for this analysis
library("TAM") # For Dichotomous Rasch Analysis
library("WrightMap") # For plotting the variable map
library("Hmisc") # For descriptive data analysis
library("formattable") # For format number as percentage

2.2.3 Import the data & Running descriptive analysis

library(readr) # Import the data from your computer
transreas <- read_csv("transreas.csv")

Then, we run a descriptive analysis on our data.

# Use the summary() function to overview the data
summary(transreas)
##     Student        Grade          task_01          task_02      
##  Min.   :  1   Min.   :2.000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:107   1st Qu.:3.000   1st Qu.:1.0000   1st Qu.:1.0000  
##  Median :213   Median :4.000   Median :1.0000   Median :1.0000  
##  Mean   :213   Mean   :4.005   Mean   :0.9412   Mean   :0.8094  
##  3rd Qu.:319   3rd Qu.:5.000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :425   Max.   :6.000   Max.   :1.0000   Max.   :1.0000  
##     task_03          task_04          task_05          task_06      
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:1.0000  
##  Median :1.0000   Median :1.0000   Median :1.0000   Median :1.0000  
##  Mean   :0.8847   Mean   :0.7835   Mean   :0.8024   Mean   :0.9741  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.0000  
##     task_07          task_08          task_09          task_10    
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00  
##  1st Qu.:1.0000   1st Qu.:1.0000   1st Qu.:0.0000   1st Qu.:0.00  
##  Median :1.0000   Median :1.0000   Median :0.0000   Median :1.00  
##  Mean   :0.8447   Mean   :0.9671   Mean   :0.3012   Mean   :0.52  
##  3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:1.00  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00
# The output reports no NA counts, so there is no missing data for any variable. It also gives a general idea of the range of grades (2 to 6) and of the minimum, maximum, and median for each task.
# You can also use describe() function as an alternative approach
describe(transreas)
## transreas 
## 
##  12  Variables      425  Observations
## --------------------------------------------------------------------------------
## Student 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##      425        0      425        1      213      142     22.2     43.4 
##      .25      .50      .75      .90      .95 
##    107.0    213.0    319.0    382.6    403.8 
## 
## lowest :   1   2   3   4   5, highest: 421 422 423 424 425
## --------------------------------------------------------------------------------
## Grade 
##        n  missing distinct     Info     Mean      Gmd 
##      425        0        5     0.96    4.005    1.615 
## 
## lowest : 2 3 4 5 6, highest: 2 3 4 5 6
##                                         
## Value          2     3     4     5     6
## Frequency     86    85    82    85    87
## Proportion 0.202 0.200 0.193 0.200 0.205
## --------------------------------------------------------------------------------
## task_01 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.166      400   0.9412    0.111 
## 
## --------------------------------------------------------------------------------
## task_02 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.463      344   0.8094   0.3093 
## 
## --------------------------------------------------------------------------------
## task_03 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.306      376   0.8847   0.2045 
## 
## --------------------------------------------------------------------------------
## task_04 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.509      333   0.7835     0.34 
## 
## --------------------------------------------------------------------------------
## task_05 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.476      341   0.8024   0.3179 
## 
## --------------------------------------------------------------------------------
## task_06 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.076      414   0.9741  0.05054 
## 
## --------------------------------------------------------------------------------
## task_07 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.394      359   0.8447    0.263 
## 
## --------------------------------------------------------------------------------
## task_08 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.096      411   0.9671  0.06386 
## 
## --------------------------------------------------------------------------------
## task_09 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.631      128   0.3012   0.4219 
## 
## --------------------------------------------------------------------------------
## task_10 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##      425        0        2    0.749      221     0.52   0.5004 
## 
## --------------------------------------------------------------------------------
# The Mean for each task in this table is the Proportion Correct statistic for each item (item difficulty estimate for Classical Test Theory)
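
To see how proportion correct relates to a logit scale, the proportions can be converted to the log-odds of an incorrect response. This is only a rough, sample-dependent indicator and not a Rasch estimate; the proportions are copied from the describe() output, and the object names are chosen for this illustration.

```r
# Proportion correct for each task, copied from the describe() output
prop_correct <- c(task_01 = 0.9412, task_02 = 0.8094, task_03 = 0.8847,
                  task_04 = 0.7835, task_05 = 0.8024, task_06 = 0.9741,
                  task_07 = 0.8447, task_08 = 0.9671, task_09 = 0.3012,
                  task_10 = 0.5200)

# Log-odds of an incorrect response: larger values mean harder items
rough_logit <- log((1 - prop_correct) / prop_correct)
round(sort(rough_logit), 2)

names(which.max(rough_logit))  # task with the lowest proportion correct
names(which.min(rough_logit))  # task with the highest proportion correct
```

The ordering of the tasks on this rough scale is determined entirely by proportion correct, so it serves as a quick sanity check on the direction of the item difficulty estimates.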

2.2.4 Running the Dichotomous Rasch Model with MML Method

To run the dichotomous Rasch model using the TAM package, we only need students’ item-level responses, so the first two columns of our data (Student and Grade) are not needed.

# Trim the data
Di_Rasch_data <- transreas[,c(-1,-2)]
head(Di_Rasch_data) # Take a look
## # A tibble: 6 x 10
##   task_01 task_02 task_03 task_04 task_05 task_06 task_07 task_08 task_09
##     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1       1       1       1       1       1       1       1       1       0
## 2       0       1       1       1       1       1       0       1       0
## 3       0       0       1       0       0       0       0       1       1
## 4       1       1       1       1       1       1       1       1       1
## 5       1       1       1       1       1       1       0       1       0
## 6       1       1       1       1       1       1       1       1       0
## # … with 1 more variable: task_10 <dbl>
# Running the Dichotomous Rasch Model
Di_Rasch_model <- tam(Di_Rasch_data)

2.2.5 Overall Model Summary

# Check the summary
summary(Di_Rasch_model)
## ------------------------------------------------------------
## TAM 3.5-19 (2020-05-05 22:45:39) 
## R version 3.6.2 (2019-12-12) x86_64, linux-gnu | nodename=bookdown | login=unknown 
## 
## Date of Analysis: 2020-11-02 02:37:08 
## Time difference of 0.121655 secs
## Computation time: 0.121655 
## 
## Multidimensional Item Response Model in TAM 
## 
## IRT Model: 1PL
## Call:
## tam.mml(resp = resp)
## 
## ------------------------------------------------------------
## Number of iterations = 26 
## Numeric integration with 21 integration points
## 
## Deviance = 3354.91 
## Log likelihood = -1677.46 
## Number of persons = 425 
## Number of persons used = 425 
## Number of items = 10 
## Number of estimated parameters = 11 
##     Item threshold parameters = 10 
##     Item slope parameters = 0 
##     Regression parameters = 0 
##     Variance/covariance parameters = 1 
## 
## AIC = 3377  | penalty=22    | AIC=-2*LL + 2*p 
## AIC3 = 3388  | penalty=33    | AIC3=-2*LL + 3*p 
## BIC = 3421  | penalty=66.57    | BIC=-2*LL + log(n)*p 
## aBIC = 3386  | penalty=31.56    | aBIC=-2*LL + log((n-2)/24)*p  (adjusted BIC) 
## CAIC = 3432  | penalty=77.57    | CAIC=-2*LL + [log(n)+1]*p  (consistent AIC) 
## AICc = 3378  | penalty=22.64    | AICc=-2*LL + 2*p + 2*p*(p+1)/(n-p-1)  (bias corrected AIC) 
## GHP = 0.39728     | GHP=( -LL + p ) / (#Persons * #Items)  (Gilula-Haberman log penalty) 
## 
## ------------------------------------------------------------
## EAP Reliability
## [1] 0.503
## ------------------------------------------------------------
## Covariances and Variances
##       [,1]
## [1,] 0.938
## ------------------------------------------------------------
## Correlations and Standard Deviations (in the diagonal)
##       [,1]
## [1,] 0.968
## ------------------------------------------------------------
## Regression Coefficients
##      [,1]
## [1,]    0
## ------------------------------------------------------------
## Item Parameters -A*Xsi
##       item   N     M xsi.item AXsi_.Cat1 B.Cat1.Dim1
## 1  task_01 425 0.941   -3.169     -3.169           1
## 2  task_02 425 0.809   -1.701     -1.701           1
## 3  task_03 425 0.885   -2.368     -2.368           1
## 4  task_04 425 0.784   -1.517     -1.517           1
## 5  task_05 425 0.802   -1.649     -1.649           1
## 6  task_06 425 0.974   -4.070     -4.070           1
## 7  task_07 425 0.845   -1.982     -1.982           1
## 8  task_08 425 0.967   -3.811     -3.811           1
## 9  task_09 425 0.301    1.002      1.002           1
## 10 task_10 425 0.520   -0.093     -0.093           1
## 
## Item Parameters in IRT parameterization
##       item alpha   beta
## 1  task_01     1 -3.169
## 2  task_02     1 -1.701
## 3  task_03     1 -2.368
## 4  task_04     1 -1.517
## 5  task_05     1 -1.649
## 6  task_06     1 -4.070
## 7  task_07     1 -1.982
## 8  task_08     1 -3.811
## 9  task_09     1  1.002
## 10 task_10     1 -0.093
# Plot the variable map 
IRT.WrightMap(Di_Rasch_model,show.thr.lab=FALSE)
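
The information criteria in the summary output can be reproduced by hand from the deviance and the number of estimated parameters, using the formulas printed next to each value. The sketch uses only numbers copied from that output; the object names are chosen for this illustration.

```r
# Values copied from the summary() output
dev       <- 3354.91  # deviance, i.e. -2 * log likelihood
n_par     <- 11       # 10 item parameters + 1 variance parameter
n_persons <- 425

aic <- dev + 2 * n_par               # AIC = -2*LL + 2*p
bic <- dev + log(n_persons) * n_par  # BIC = -2*LL + log(n)*p
round(c(AIC = aic, BIC = bic))
```

These values are mainly useful for comparing competing models fitted to the same data, where smaller values indicate the preferred model.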

2.2.6 Item Parameters

First of all, let’s pull out the item parameters from your model.

difficulty <- Di_Rasch_model$xsi
head(difficulty) 
##               xsi    se.xsi
## task_01 -3.168648 0.2133573
## task_02 -1.700894 0.1321453
## task_03 -2.367826 0.1600094
## task_04 -1.517158 0.1265292
## task_05 -1.649169 0.1304829
## task_06 -4.069890 0.3113083
mean(difficulty$xsi) # The mean difficulty of the tasks is -1.936
## [1] -1.935777
sd(difficulty$xsi) # The standard deviation for task difficulty is 1.57
## [1] 1.567723
mean(difficulty$se.xsi) # The mean Standard Error for task difficulty is 0.171.
## [1] 0.1714946
  • The ‘xsi’ column denotes the item difficulty in a logit scale; ‘se.xsi’ is the standard error for each item. The standard error is an estimate of the precision of the item difficulty estimates, where larger standard errors indicate less precise estimates.

  • Since the xsi indicates the item difficulty, higher values indicate more-difficult items (higher levels of the construct are required for a positive response). For instance, item 9 is the hardest item ( xsi = 1.00), whereas item 6 is the easiest item (xsi = -4.07).
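
One way to make the standard errors concrete is to turn them into approximate 95% confidence intervals. The sketch below assumes a normal approximation (estimate ± 1.96 × SE), which is a common convention rather than something TAM reports; the estimates are copied from the head(difficulty) output.

```r
# Estimates and standard errors copied from head(difficulty)
item_est <- data.frame(
  item   = c("task_01", "task_02", "task_03", "task_04", "task_05", "task_06"),
  xsi    = c(-3.168648, -1.700894, -2.367826, -1.517158, -1.649169, -4.069890),
  se.xsi = c(0.2133573, 0.1321453, 0.1600094, 0.1265292, 0.1304829, 0.3113083)
)

# Approximate 95% confidence limits under a normal approximation
z <- qnorm(0.975)
item_est$lower <- item_est$xsi - z * item_est$se.xsi
item_est$upper <- item_est$xsi + z * item_est$se.xsi
item_est$width <- item_est$upper - item_est$lower

# Show the items from easiest to hardest
item_est[order(item_est$xsi), ]
```

Among these six items the widest interval belongs to task_06, the easiest item: almost every student answered it correctly, so the data carry little information about its exact location.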

We can also visualize the item difficulty using a simple histogram

hist(difficulty$xsi,breaks=10) 

Now let’s calculate the Item fit statistics

Di_Item_fit <- tam.fit(Di_Rasch_model) 
## Item fit calculation based on 40 simulations
## |**********|
## |----------|
head(Di_Item_fit)
## $itemfit
##    parameter    Outfit   Outfit_t    Outfit_p Outfit_pholm     Infit    Infit_t
## 1    task_01 0.8032809 -1.2131743 0.225063204   0.90025282 0.9348243 -0.3369912
## 2    task_02 1.1595568  1.9648148 0.049435676   0.32369326 1.0888328  1.1504316
## 3    task_03 0.7945699 -1.9931786 0.046241894   0.32369326 0.9165093 -0.7478803
## 4    task_04 1.2428057  3.2805783 0.001035945   0.01035945 1.1239968  1.7546662
## 5    task_05 1.0579885  0.7710351 0.440686125   0.90025282 1.0291322  0.4075138
## 6    task_06 0.4081023 -2.7413441 0.006118839   0.05506955 0.8974970 -0.2942547
## 7    task_07 0.8782256 -1.4404992 0.149726214   0.74863107 0.9447391 -0.6062443
## 8    task_08 0.5630576 -2.1240796 0.033663489   0.26930791 0.9243890 -0.2458878
## 9    task_09 1.0573895  1.0770925 0.281438925   0.90025282 1.0081116  0.1572935
## 10   task_10 0.9787992 -0.5712978 0.567797774   0.90025282 0.9896214 -0.2732941
##       Infit_p Infit_pholm
## 1  0.73612357   1.0000000
## 2  0.24996614   1.0000000
## 3  0.45453233   1.0000000
## 4  0.07931642   0.7931642
## 5  0.68363065   1.0000000
## 6  0.76856330   1.0000000
## 7  0.54435257   1.0000000
## 8  0.80576909   1.0000000
## 9  0.87501358   1.0000000
## 10 0.78462713   1.0000000
## 
## $time
## [1] "2020-11-02 02:37:08 UTC" "2020-11-02 02:37:08 UTC"
## 
## $CALL
## tam.fit(tamobj = Di_Rasch_model)
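
To show what the Outfit and Infit mean squares measure, here is a minimal sketch of their textbook definitions based on standardized residuals. It will not reproduce the tam.fit() values, which for MML models are obtained by simulation (as the console message indicates); `msq_fit()` is a hypothetical helper, and the person locations and responses are made up.

```r
# msq_fit() is a hypothetical helper: mean-square fit for one item, given
# 0/1 responses x, person locations theta, and item difficulty delta.
msq_fit <- function(x, theta, delta) {
  p  <- plogis(theta - delta)  # model-expected score for each person
  v  <- p * (1 - p)            # model variance of each response
  z2 <- (x - p)^2 / v          # squared standardized residuals
  c(outfit = mean(z2),                  # unweighted mean square
    infit  = sum((x - p)^2) / sum(v))   # information-weighted mean square
}

theta <- c(-2, -1, 0, 1, 2)  # made-up person locations
delta <- 0                   # made-up item difficulty

# A pattern in line with the model: low scorers fail, high scorers succeed
fit_expected <- msq_fit(c(0, 0, 1, 1, 1), theta, delta)

# The reversed pattern, which the model does not expect
fit_reversed <- msq_fit(c(1, 1, 0, 0, 0), theta, delta)

round(rbind(fit_expected, fit_reversed), 2)
```

Values near 1 indicate as much randomness as the model predicts, values well below 1 indicate responses that are more predictable than expected, and values well above 1 indicate unexpected responses. Outfit reacts more strongly than Infit to surprises from persons located far from the item.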

2.2.7 Person Parameters

Get the person ability by using tam.wle function

Person_ability <- tam.wle(Di_Rasch_model)
## Iteration in WLE/MLE estimation  1   | Maximal change  2.783 
## Iteration in WLE/MLE estimation  2   | Maximal change  0.8352 
## Iteration in WLE/MLE estimation  3   | Maximal change  0.0976 
## Iteration in WLE/MLE estimation  4   | Maximal change  0.0058 
## Iteration in WLE/MLE estimation  5   | Maximal change  4e-04 
## Iteration in WLE/MLE estimation  6   | Maximal change  0 
## ----
##  WLE Reliability= 0.308
# View(Person_ability)
# In the data frame above, the `theta` is the person's ability measure on a logit scale.
mean(Person_ability$theta) # The average ability for test-taker is -0.056.
## [1] -0.05565378
sd(Person_ability$theta) # The standard deviation for test-taker ability measures is 1.28.
## [1] 1.281021
mean(Person_ability$error) # The mean Standard Error for test-taker ability measures is 1.023.
## [1] 1.022701

Visualize the Person ability measures using a simple histogram

hist(Person_ability$theta)

Calculate the Person fit statistics

Di_Person_fit <- tam.personfit(Di_Rasch_model) 
head(Di_Person_fit)
##   outfitPerson outfitPerson_t infitPerson infitPerson_t
## 1   0.23116561     -0.6747306   0.3612124    -1.3926697
## 2   1.06501332      0.3075433   1.1169890     0.4450260
## 3   4.21519230      2.8424540   2.0818362     2.5767465
## 4   0.04344653      0.8893776   0.1568586    -0.5789409
## 5   0.54777995     -0.4591824   0.6968196    -0.6503183
## 6   0.16335767     -0.1230185   0.3844617    -0.9719057

2.2.8 Summarize the results in a table

For Item Calibration Table

# Set up the contents for table2
Table2 <- setNames(data.frame(matrix(ncol = 8, nrow = 10)), c("TaskID", "PropCorrect", "Delta","SE","Outfit","Outfit_P","Infit","Infit_P"))
# Calculate the proportion correct (you can also type in these values by hand from the previous outcome)
TaskCorrect <- apply(Di_Rasch_data, 2, sum)
PropCorrect <- percent(TaskCorrect/425)
Table2$TaskID <- 1:10
Table2$PropCorrect <- PropCorrect
Table2$Delta <- difficulty$xsi
Table2$SE <- difficulty$se.xsi
Table2$Outfit <- Di_Item_fit[["itemfit"]][["Outfit"]]
Table2$Outfit_P <- Di_Item_fit[["itemfit"]][["Outfit_p"]]
Table2$Infit <- Di_Item_fit[["itemfit"]][["Infit"]]
Table2$Infit_P <- Di_Item_fit[["itemfit"]][["Infit_p"]]
# Sort Table2 from the easiest to the hardest item (descending proportion correct, which here matches ascending Delta)
Table2 <- Table2[order(-PropCorrect),]

For Person calibration table

# Set up the contents for table3
Table3 <- setNames(data.frame(matrix(ncol = 8, nrow = 425)), c("TestTakerID", "PropCorrect", "Theta","SE","Outfit","Outfit_t","Infit","Infit_t"))
# Calculate the Proportion Correct
Person_Score <- rowSums(Di_Rasch_data, na.rm=FALSE) 
Person_PropCorrect <- Person_Score/10
Table3$TestTakerID <- 1:425
Table3$PropCorrect <- Person_PropCorrect
Table3$Theta <- Person_ability$theta
Table3$SE <- Person_ability$error
Table3$Outfit <- Di_Person_fit$outfitPerson
Table3$Outfit_t <- Di_Person_fit$outfitPerson_t
Table3$Infit <- Di_Person_fit$infitPerson
Table3$Infit_t <- Di_Person_fit$infitPerson_t 
# Note that tam.personfit() reports only t values, not p values. You can still calculate p values yourself if you need them.
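
A p value can be obtained from a standardized fit statistic by referring it to a standard normal distribution. The normal reference distribution is an assumption here, although it does reproduce the Outfit_p that tam.fit() reported for task_04. The t values are copied from the head(Di_Person_fit) output.

```r
# Standardized person outfit statistics copied from head(Di_Person_fit)
outfit_t <- c(-0.6747306, 0.3075433, 2.8424540, 0.8893776, -0.4591824, -0.1230185)

# Two-sided p value under a standard normal reference distribution (assumption)
outfit_p <- 2 * pnorm(-abs(outfit_t))
round(outfit_p, 3)

# Cross-check against the item fit output: task_04 had
# Outfit_t = 3.2805783 and Outfit_p = 0.001035945
check_p <- 2 * pnorm(-abs(3.2805783))
check_p
```

The same line of code can be applied to the Outfit_t and Infit_t columns of Table3 to add p value columns.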

2.3 Running the Dichotomous Rasch Model with JML Method

The tam.jml() function can also be used to fit the dichotomous Rasch model with the JML method, which is the same estimation method used by the Winsteps software. Most of the code is the same as in the previous example. This section gives an example of running the dichotomous Rasch model with the JML method.

# Running the Dichotomous Rasch Model use tam.jml() function
Di_Rasch_model_jml <- tam.jml(Di_Rasch_data)

2.3.1 Overall Model Summary

# Check the summary
summary(Di_Rasch_model_jml)
## ------------------------------------------------------------
## TAM 3.5-19 (2020-05-05 22:45:39) 
## R version 3.6.2 (2019-12-12) x86_64, linux-gnu | nodename=bookdown | login=unknown 
## 
## Start of Analysis: 2020-11-02 02:37:08 
## End of Analysis: 2020-11-02 02:37:08 
## Time difference of 0.03071404 secs
## Computation time: 0.03071404 
## 
## Joint Maximum Likelihood Estimation in TAM 
## 
## IRT Model
## Call:
## tam.jml(resp = Di_Rasch_data)
## 
## ------------------------------------------------------------
## Number of iterations = 10 
## 
## Deviance = 2607.22  | Log Likelihood = -1303.61 
## Number of persons = 425 
## Number of items = 10 
## constraint = cases 
## bias = TRUE 
## ------------------------------------------------------------
## Person Parameters xsi
## M = 0 
## SD = 1.47 
## ------------------------------------------------------------
## Item Parameters xsi
##    xsi.label xsi.index    xsi se.xsi
## 1    task_01         1 -3.234  0.224
## 2    task_02         2 -1.767  0.140
## 3    task_03         3 -2.435  0.169
## 4    task_04         4 -1.581  0.134
## 5    task_05         5 -1.714  0.138
## 6    task_06         6 -4.128  0.326
## 7    task_07         7 -2.049  0.150
## 8    task_08         8 -3.873  0.292
## 9    task_09         9  1.046  0.125
## 10   task_10        10 -0.118  0.115
## ------------------------------------------------------------
## Item Parameters -A*Xsi
##       item   N     M xsi.item AXsi_.Cat1 B.Cat1.Dim1
## 1  task_01 425 0.941   -3.234     -3.234           1
## 2  task_02 425 0.809   -1.767     -1.767           1
## 3  task_03 425 0.885   -2.435     -2.435           1
## 4  task_04 425 0.784   -1.581     -1.581           1
## 5  task_05 425 0.802   -1.714     -1.714           1
## 6  task_06 425 0.974   -4.128     -4.128           1
## 7  task_07 425 0.845   -2.049     -2.049           1
## 8  task_08 425 0.967   -3.873     -3.873           1
## 9  task_09 425 0.301    1.046      1.046           1
## 10 task_10 425 0.520   -0.118     -0.118           1
# Plot the person-item map
difficulty <- Di_Rasch_model_jml$xsi
# wrightMap() expects the person estimates first, then the item difficulties
wrightMap(Di_Rasch_model_jml$theta, difficulty)

##             [,1]
##  [1,] -3.2344237
##  [2,] -1.7665739
##  [3,] -2.4352438
##  [4,] -1.5813350
##  [5,] -1.7144842
##  [6,] -4.1281759
##  [7,] -2.0494100
##  [8,] -3.8725301
##  [9,]  1.0461546
## [10,] -0.1178787

2.3.2 Item Parameters

First of all, let’s pull out the item parameters from your model.

difficulty <- Di_Rasch_model_jml$xsi
head(difficulty) 
## [1] -3.234424 -1.766574 -2.435244 -1.581335 -1.714484 -4.128176
mean(difficulty) # The mean difficulty of the tasks is -1.985
## [1] -1.98539
sd(difficulty) # The standard deviation for task difficulty is 1.594
## [1] 1.594497
  • For a tam.jml() model, ‘xsi’ is a vector of item difficulties on the logit scale; the corresponding standard errors (‘se.xsi’) are reported in the model summary. The standard error is an estimate of the precision of the item difficulty estimates, where larger standard errors indicate less precise estimates.

  • Since the xsi indicates the item difficulty, higher values indicate more-difficult items (higher levels of the construct are required for a positive response). For instance, item 9 is the hardest item (xsi = 1.05), whereas item 6 is the easiest item (xsi = -4.13).

We can also visualize the item difficulty using a simple histogram

hist(difficulty) 

Now let’s calculate the Item fit statistics

Di_fit <- tam.jml.fit(Di_Rasch_model_jml) 
head(Di_fit$fit.item)
##            item outfitItem outfitItem_t infitItem infitItem_t
## task_01 task_01  0.4420089   -1.6559613 0.6950708  -2.3161703
## task_02 task_02  1.0221315    0.1803034 1.0758295   1.0613255
## task_03 task_03  0.4834787   -2.3565396 0.7056770  -3.3346910
## task_04 task_04  1.1717216    1.0243575 1.1555052   2.2693943
## task_05 task_05  0.8205063   -1.0015131 0.9787715  -0.2834854
## task_06 task_06  0.1317884   -2.1305158 0.5047768  -2.6925253

2.3.3 Person Parameters and fit statistics

head(Di_fit$fit.person)
##   outfitPerson outfitPerson_t infitPerson infitPerson_t
## 1   0.22639053     -0.6121284   0.3707819    -1.2890786
## 2   1.01998416      0.2353898   1.1223551     0.4674451
## 3   5.16762942      2.9011650   2.0875413     2.4600841
## 4   0.03242973      1.2237027   0.1268986    -0.4169056
## 5   0.53999818     -0.4830984   0.6931769    -0.6631694
## 6   0.16217148      0.1160049   0.4698892    -0.6485870
# Person ability estimates are stored in the `theta` element of the JML model object.
# `theta` is the person's ability measure on a logit scale.
mean(Di_Rasch_model_jml$theta) # The average ability for test-taker is almost zero.
## [1] -4.090165e-17
sd(Di_Rasch_model_jml$theta) # The standard deviation for test-taker ability measures is 1.472.
## [1] 1.472164
mean(Di_Rasch_model_jml[["errorWLE"]]) # The mean Standard Error for test-taker ability measures is 1.025.
## [1] 1.025197

Visualize the Person ability measures using a simple histogram

hist(Di_Rasch_model_jml$theta)

2.3.4 Compare the results of estimated parameters between JML and MML method

plot(Di_Rasch_model_jml$xsi, Di_Rasch_model$xsi$xsi, pch=16,
xlab=expression(paste(xi[i], "(JML)")),
ylab=expression(paste(xi[i], "(MML)")),
main="Item Parameter Estimate Comparison")
lines(c(-5,5), c(-5,5), col="gray" )
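
The plot can be complemented with a numeric comparison. This sketch uses the rounded item estimates copied from the two model summaries, so it runs without the fitted model objects; the object names are chosen for this illustration.

```r
# Item estimates copied from the MML and JML summary() outputs
mml <- c(-3.169, -1.701, -2.368, -1.517, -1.649, -4.070, -1.982, -3.811, 1.002, -0.093)
jml <- c(-3.234, -1.767, -2.435, -1.581, -1.714, -4.128, -2.049, -3.873, 1.046, -0.118)
names(mml) <- names(jml) <- sprintf("task_%02d", 1:10)

est_diff <- jml - mml
round(est_diff, 3)   # per-item difference (JML minus MML), in logits
max(abs(est_diff))   # largest absolute difference
cor(jml, mml)        # agreement in the ordering and spacing of the items
```

All differences are well below 0.1 logits, which is smaller than every item's standard error, so the two methods lead to the same substantive conclusions for these data.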

2.4 Example APA-Style Results Write-Up (Transitive Reasoning Test)

Table 1 presents a summary of the results from the analysis of the transitive reasoning data (Sijtsma & Molenaar, 2002) using the dichotomous Rasch model (Rasch, 1960). Specifically, the calibrations of test participants (N = 425) and tasks (N = 10) are summarized using average logit-scale calibrations, standard errors, and model-data fit statistics. Examination of the results indicates that, on average, the test takers were located higher on the logit scale (M = -0.056, SD = 1.281) than the tasks (M = -1.936, SD = 1.568). This finding suggests that the items were relatively easy for the sample of students who participated in this transitive reasoning test. However, the average standard error (SE) was considerably larger for students (M = 1.023) than for tasks (M = 0.171), indicating that there may be some issues related to targeting for some of the students who participated in the assessment. Average values of model-data fit statistics indicate overall adequate fit to the model: the average Infit and Outfit mean square statistics were close to their expected value of 1.00, and the average standardized Infit and Outfit statistics were close to 0.00, the value expected when the data fit the model. This finding of adequate fit to the model supports the interpretation of item and person calibrations on the logit scale as indicators of their locations on the latent variable measured by the test.

# Print the table2 in a neat way
knitr::kable(
  Table2[,-1], booktabs = TRUE,
  caption = 'Item Calibration'
)
Table 2.1: Item Calibration
Task  PropCorrect  Delta       SE         Outfit     Outfit_P   Infit      Infit_P
6     97.41%       -4.0698905  0.3113083  0.4081023  0.0061188  0.8974970  0.7685633
8     96.71%       -3.8109225  0.2780165  0.5630576  0.0336635  0.9243890  0.8057691
1     94.12%       -3.1686478  0.2133573  0.8032809  0.2250632  0.9348243  0.7361236
3     88.47%       -2.3678260  0.1600094  0.7945699  0.0462419  0.9165093  0.4545323
7     84.47%       -1.9824195  0.1423829  0.8782256  0.1497262  0.9447391  0.5443526
2     80.94%       -1.7008935  0.1321453  1.1595568  0.0494357  1.0888328  0.2499661
5     80.24%       -1.6491690  0.1304829  1.0579885  0.4406861  1.0291322  0.6836306
4     78.35%       -1.5171582  0.1265292  1.2428057  0.0010359  1.1239968  0.0793164
10    52.00%       -0.0932481  0.1061752  0.9787992  0.5677978  0.9896214  0.7846271
9     30.12%        1.0024016  0.1145384  1.0573895  0.2814389  1.0081116  0.8750136

Table 2.1 includes detailed results for the 10 tasks included in the Transitive Reasoning Test. For each item, the proportion of correct responses is presented, followed by the logit-scale calibration (δ), SE, and model-data fit statistics. Examination of these results indicates that Task 9 was the most difficult (Proportion Correct = 30.12%; δ = 1.00; SE = 0.11), followed by Task 10 (Proportion Correct = 52.00%; δ = -0.09; SE = 0.11). The easiest item was Task 6 (Proportion Correct = 97.41%; δ = -4.07; SE = 0.31).

# Print the table3 in a neat way
knitr::kable(
  head(Table3,10), booktabs = TRUE,
  caption = 'Person Calibration'
)
Table 2.2: Person Calibration
TestTakerID PropCorrect Theta SE Outfit Outfit_t Infit Infit_t
1 0.8 -0.1632616 0.9172265 0.2311656 -0.6747306 0.3612124 -1.3926697
2 0.6 -1.4663564 0.7688974 1.0650133 0.3075433 1.1169890 0.4450260
3 0.4 -2.5351672 0.7587624 4.2151923 2.8424540 2.0818362 2.5767465
4 1.0 2.3471248 1.7628252 0.0434465 0.8893776 0.1568586 -0.5789409
5 0.7 -0.8772971 0.8190927 0.5477800 -0.4591824 0.6968196 -0.6503183
6 0.9 0.7916991 1.1106587 0.1633577 -0.1230185 0.3844617 -0.9719057
7 1.0 2.3471248 1.7628252 0.0434465 0.8893776 0.1568586 -0.5789409
8 0.7 -0.8772971 0.8190927 0.7003565 -0.1849399 0.7839575 -0.4019699
9 0.9 0.7916991 1.1106587 0.1633577 -0.1230185 0.3844617 -0.9719057
10 0.7 -0.8772971 0.8190927 0.4158228 -0.7412154 0.5673364 -1.0609277

Table 2.2 includes detailed results for the first 10 test takers who participated in the Transitive Reasoning Test. For each participant, the proportion of correct responses is presented, followed by their logit-scale measure (θ), SE, and model-data fit statistics. Examination of the full results indicates that around 51 participants had the highest score (Proportion Correct = 100%; θ = 2.347; SE = 1.763). The lowest-scoring test taker was ID 148 (Proportion Correct = 10%; θ = -4.52; SE = 1.03).

# Plot the variable-Map
IRT.WrightMap(Di_Rasch_model,show.thr.lab=FALSE)

Figure 1 illustrates the calibrations of the participants and items on the logit scale that represents the latent variable. The calibrations shown in this figure correspond to those presented in Table 2.1 and Table 2.2 for items and persons, respectively. The rightmost column (Measure) shows the logit scale. Higher numbers correspond to higher levels of achievement (for persons) and higher levels of difficulty (for items); lower numbers correspond to lower achievement and less difficulty. Next, respondent locations on the latent variable are illustrated using the histogram. Examination of the histogram indicates a wide spread of achievement levels, with most students grouped near the middle of the logit scale (θ = 0.00). Finally, task locations on the logit scale are plotted on the right side. Examination of the task locations indicates an overall spread similar to that of the participant measures. However, the tasks appear somewhat clustered in the lower half of the logit scale, with few items located at or above the center of the scale (θ ≥ 0.00). This lack of moderate-difficulty items may have contributed to the somewhat large SE values for students with middle-range calibrations.
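
The link between item targeting and person standard errors can be checked with a short calculation. The sketch below uses the usual asymptotic result that the standard error of a maximum likelihood ability estimate is 1 / sqrt(test information). This is an approximation: the WLE standard errors reported by TAM will differ somewhat. `se_theta()` is a hypothetical helper, and the item difficulties are copied from the MML summary.

```r
# MML item difficulties copied from the model summary
delta <- c(-3.169, -1.701, -2.368, -1.517, -1.649, -4.070, -1.982, -3.811, 1.002, -0.093)

# se_theta() is a hypothetical helper: approximate standard error of an
# ability estimate at a given theta, 1 / sqrt(test information)
se_theta <- function(theta, delta) {
  p <- plogis(theta - delta)   # probability of success on each item
  1 / sqrt(sum(p * (1 - p)))   # item information is p * (1 - p)
}

# Evaluate the approximate standard error across the logit scale
theta_grid <- c(-4, -2, 0, 2, 4)
se_grid <- sapply(theta_grid, se_theta, delta = delta)
round(setNames(se_grid, theta_grid), 2)
```

The standard error is smallest around θ = -2, where most tasks are located, and grows toward the middle and upper part of the scale, where few tasks provide information.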

2.5 Exercise

Use the simulated data to run the dichotomous Rasch model with the TAM package.

[The Data could be either attached to this site or Blackboard]
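
If the exercise data are not at hand, a practice data set can be generated in base R. This is only a sketch of one way to simulate dichotomous Rasch data; it is not the data set intended for the exercise, and the sample size, item difficulties, seed, and object names are all made up.

```r
set.seed(123)                         # arbitrary seed, for reproducibility

n_persons <- 500                      # made-up sample size
delta <- seq(-2, 2, length.out = 8)   # made-up item difficulties (logits)
theta <- rnorm(n_persons, 0, 1)       # made-up person locations (logits)

# Probability of a correct response: persons in rows, items in columns
prob <- plogis(outer(theta, delta, "-"))

# Draw 0/1 responses from the model
sim_data <- matrix(rbinom(length(prob), size = 1, prob = prob),
                   nrow = n_persons,
                   dimnames = list(NULL, sprintf("item_%02d", seq_along(delta))))
sim_data <- as.data.frame(sim_data)

head(sim_data)
round(colMeans(sim_data), 2)          # proportion correct per item

# sim_model <- TAM::tam(sim_data)     # then proceed as in Section 2.2
```

Because the generating difficulties are known, the estimated item parameters can be compared with `delta` to see how well the model recovers them.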