Chapter 5 Optional: Understanding the model

TAM also provides some descriptive statistics.

item_prop <- mod1$item

item_prop

	item	N	M	xsi.item	AXsi_.Cat1	B.Cat1.Dim1
V1	V1	1000	0.182	1.7930554	1.7930554	1
V2	V2	1000	0.074	2.9361446	2.9361446	1
V3	V3	1000	0.175	1.8479678	1.8479678	1
V4	V4	1000	0.164	1.9375212	1.9375212	1
V5	V5	1000	0.280	1.1391729	1.1391729	1
V6	V6	1000	0.566	-0.3249806	-0.3249806	1
V7	V7	1000	0.440	0.2916454	0.2916454	1
V8	V8	1000	0.479	0.1004197	0.1004197	1
V9	V9	1000	0.435	0.3163527	0.3163527	1
V10	V10	1000	0.915	-2.7690302	-2.7690302	1
V11	V11	1000	0.123	2.3170294	2.3170294	1
V12	V12	1000	0.760	-1.3863445	-1.3863445	1
V13	V13	1000	0.936	-3.1003226	-3.1003226	1
V14	V14	1000	0.612	-0.5554452	-0.5554452	1
V15	V15	1000	0.541	-0.2020052	-0.2020052	1

Note, the total number of people who answered an item correctly is a sufficient statistic for calculating an item’s difficulty. Said another way, the number of correct answers, or, number of people who endorse a category increases monotonically with the item difficulty (of course, this does not mean you can just replace the Rasch model with a sum score since we’re using the Rasch model to test whether summing items at all is a reasonable thing to do).

To see this, we can find the total number of people who endorsed the “agree” category for each Hls item above. The table provides the proportion who endorsed the higher category in the M column. For instance, item Hls1 had 15.77% of people endorse the “agree” category (1= agree, 0= disagree). In the N column, we see that 317 people answered the item in total.

That means that $317*.1577$ = 50 people answering the item correctly. Note, the estimated difficulty found in the column is 2.43 logits.

# Confirm that the total number of endorsements (coded 1) is 50 for Hls1: sum down the column containing all answers to Hls1 in the raw data.


apply(hls[1], 2, sum)

##  V1 
## 182

However, we see that for item Hls5, 27% of people endorsed that item and the estimated mean item difficulty in xsi.item is 1.50 logits.

The correlation between total number of endorsements per item and the estimated item difficulty can be computed as follows.

# create a column in the item_prop object that has the total number of endorsements for each item
item_prop <- mutate(item_prop, total_endorsed =N*M)

cor(item_prop$xsi.item, item_prop$total_endorsed)

## [1] -0.994751

We see that the correlation between item difficulties and total endorsements per item is nearly perfect -.97. As the number of endorsements go down, the estimated difficulty of the item increase.

ggplot(item_prop, aes(x=total_endorsed, y=xsi.item)) + 
  geom_point() +
  ylab("Estimated Item Difficulties (logits)") +
  xlab("Total Number of Endorsements for an item") +
  ggtitle("Relationship between estimated item difficulty and total endorsements")