R can be used to fit a Rasch model in Item Response Theory (IRT). The Rasch model was proposed in the 1960s by the Danish statistician Georg Rasch. The basic Rasch model separates the ability of the test takers from the difficulty of the test items. In this section we go over how R can be used to estimate this model.
Take the following table as an example. It shows the results of 35 students taking an 18-item ability test. Each item is coded as correct (1) or incorrect (0). The data come from the first example of a popular free program called MiniStep, the evaluation version of a larger, more complete program called WinStep. MiniStep can be downloaded from http://www.winstep.org; it runs only under Windows.
At the bottom of the table is a row showing the percentage of students who answered each question correctly. Questions 1, 2, and 3 are the easiest because every student got them right. Question 18 is the hardest because none of the students knew the answer to it. Questions like these are not useful for distinguishing students of high ability from students of low ability, so they are omitted from the calculation.
name      q1  q2  q3  q4  q5  q6  q7  q8  q9 q10 q11 q12 q13 q14 q15 q16 q17 q18
Richard    1   1   1   1   1   1   1   0   0   0   0   0   0   0   0   0   0   0
Tracie     1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Walter     1   1   1   1   1   1   1   1   1   0   0   1   0   0   0   0   0   0
Blaise     1   1   1   1   0   0   1   0   1   0   0   0   0   0   0   0   0   0
Ron        1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
William    1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Susan      1   1   1   1   1   1   1   1   1   1   1   1   1   0   1   0   0   0
Linda      1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Kim        1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Carol      1   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0
Pete       1   1   1   0   1   1   1   1   1   0   0   0   0   0   0   0   0   0
Brenda     1   1   1   1   1   0   1   0   1   1   0   0   0   0   0   0   0   0
Mike       1   1   1   1   1   0   0   1   1   1   1   1   0   0   0   0   0   0
Zula       1   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0
Frank      1   1   1   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0
Dorothy    1   1   1   1   1   1   1   1   1   0   1   0   0   0   0   0   0   0
Rod        1   1   1   1   0   1   1   1   1   1   0   0   0   0   0   0   0   0
Britton    1   1   1   1   1   1   1   1   1   1   0   0   1   0   0   0   0   0
Janet      1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0   0
David      1   1   1   1   1   1   1   1   1   1   0   0   1   0   0   0   0   0
Thomas     1   1   1   1   1   1   1   1   1   1   1   0   1   0   0   0   0   0
Betty      1   1   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0
Bert       1   1   1   1   1   1   1   1   1   1   0   0   1   1   0   0   0   0
Rick       1   1   1   1   1   1   1   1   1   1   1   0   1   0   0   1   1   0
Don        1   1   1   0   1   1   0   0   0   0   0   0   0   0   0   0   0   0
Barbara    1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Adam       1   1   1   1   1   1   1   0   0   0   0   0   0   0   0   0   0   0
Audrey     1   1   1   1   1   1   1   1   1   0   1   0   0   0   0   0   0   0
Anne       1   1   1   1   1   1   0   0   1   1   1   0   0   1   0   0   0   0
Lisa       1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0   0
James      1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Joe        1   1   1   1   1   1   1   1   1   1   1   0   0   0   0   0   0   0
Martha     1   1   1   1   0   0   1   0   0   1   0   0   0   0   0   0   0   0
Elsie      1   1   1   1   1   1   1   1   1   1   0   1   0   1   0   0   0   0
Helen      1   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
---------------------------------------------------------------------------------
           1   1   1 .91 .88 .86 .88 .77 .86 .69 .34 .17 .20 .09 .03 .03 .03 0.0
We also need to examine whether there are students who get all or none of the items right. If we omit items 1, 2, 3, and 18 and look at the 0's and 1's for each student, we see that Helen fails all of the remaining items. Helen therefore tells us nothing about which items are harder and which are easier; all of them are beyond her ability. Students who get every item right also give us no useful information, but in this example no student answers all items correctly.
This is how many special-purpose statistical packages prepare raw data for IRT analysis. These packages set aside any items or persons that provide no useful information, fit the IRT model to the remaining data, and then extrapolate from that solution to produce estimates for the items and persons that were set aside.
In R we can use a conditional logit model to perform IRT analyses. The conditional logit model is in the survival analysis library maintained by Thomas Lumley at the University of Washington. A description of the survival analysis library and the additional packages it depends on can be obtained with the R command library(help=survival). The R code offered in this section has been tested with version 2.11-4 of the survival library in R-1.9.0. Rasch's original derivation was different from, but equivalent to, a fixed-effect conditional logit model. Note that the procedures below are suitable only for binary items (e.g., pass/fail on test items).
The Rasch model states that the log odds of a person answering an item correctly is a function of the person's ability (θ) and the item's difficulty (β). Difficult items are hard to get right even for people with high ability. The odds of getting an item right decrease with item difficulty (and thus the minus sign before β). The notation we use here is from Embretson and Reise (2000):

   log[ P_is / (1 - P_is) ] = θ_s - β_i

where P_is is the probability that person s answers item i correctly, θ_s is the ability of person s, and β_i is the difficulty of item i.
The example data, after removing Helen and items 1, 2, 3, and 18, contain 34 students and 14 items (questions 4 to 17). Richard is person 1, Tracie is person 2, and so on. To evaluate Richard's chance of answering an item correctly, we take Richard's ability (say it is 3 in logit units), subtract from it the item's difficulty (say it is 2.2 in logit units), and get 0.8 logits. We can convert the logit scale back to a probability by taking exp(0.8)/(1 + exp(0.8)), which gives 0.69. Richard has a 69% chance of getting that item right.
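As a quick check on this arithmetic, the conversion can be done directly in R. The built-in plogis() function computes exp(x)/(1 + exp(x)), so the two commands below are equivalent; the 3 and 2.2 are just the hypothetical ability and difficulty values used above.

> exp(0.8) / (1 + exp(0.8))   # convert 0.8 logits to a probability
[1] 0.6899745
> plogis(3 - 2.2)             # same conversion with the built-in logistic function
[1] 0.6899745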
The question is, how do we calculate the θ's and the β's in the equation?
We first need to load the survival analysis library by typing library(survival). Then we need to set aside the uninformative items and persons. This is done with the following commands:
> library(survival)
> exam1 <- read.csv(file="exam1.dat", sep="", header=T, row.names=NULL)
> exam1$id <- factor(paste("s", 1:dim(exam1)[1], sep=""))
> #
> # percentage of Ss who can correctly answer the items
> #
> round(apply(exam1[, 3:20], 2, mean), 2)  # items 1, 2, 3 and 18 are no good
  q1   q2   q3   q4   q5   q6   q7   q8   q9  q10  q11  q12  q13  q14  q15  q16
1.00 1.00 1.00 0.91 0.89 0.86 0.89 0.77 0.86 0.69 0.34 0.17 0.20 0.09 0.03 0.03
 q17  q18
0.03 0.00
> round(apply(exam1[, 7:19], 1, mean), 2)  # Helen got every good item wrong
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
0.23 0.46 0.46 0.15 0.46 0.46 0.77 0.46 0.46 0.54 0.38 0.31 0.46 0.54 0.69 0.46
  17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32
0.38 0.54 0.38 0.54 0.62 0.62 0.62 0.77 0.15 0.46 0.23 0.46 0.46 0.38 0.46 0.54
  33   34   35
0.15 0.62 0.00
> exam1[35, ]
    name sex q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18  id
35 Helen   F  1  1  1  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0 s35
> # remove items 1, 2, 3, and 18 because they provide no useful information
> names(exam1)
 [1] "name" "sex"  "q1"   "q2"   "q3"   "q4"   "q5"   "q6"   "q7"   "q8"
[11] "q9"   "q10"  "q11"  "q12"  "q13"  "q14"  "q15"  "q16"  "q17"  "q18"
[21] "id"
> exam1.1 <- exam1[, -c(3,4,5,20)]
> # remove Helen (the last row), who answered all good items incorrectly
> exam1.1 <- exam1.1[ -dim(exam1.1)[1], ]
To run the conditional logit model clogit(), we need to reshape the data from a wide format to a long format. The reshape() command takes q4:q17 and organizes them into a new variable called resp. Another new variable, item, is created to identify questions 4 to 17.
> exam1.1 <- reshape(exam1.1, varying=list(paste("q", 4:17, sep="")),
+                    timevar="item", v.names="resp", direction="long")
In the equation, the item difficulty parameters (the β's) have a minus sign before them. In clogit() a parameter with a minus sign is represented by a dummy variable coded -1. In the long format the items are identified by the variable item. An excerpt of the full data shows that Martha and Elsie both answered item 1 correctly; their answers to item 1 are followed by the answers of Richard, Tracie, and Walter to the next item (item 2).
> exam1.1[34:38, ]
         name sex  id item resp
s33.1  Martha   F s33    1    1
s34.1   Elsie   F s34    1    1
s1.2  Richard   M  s1    2    1
s2.2   Tracie   F  s2    2    1
s3.2   Walter   M  s3    2    1
Using the above data excerpt as an example, we can create dummy variables for items 1 and 2. The R commands are tt <- factor(exam1.1[34:38, "item"]); dummy <- diag(nlevels(tt))[tt, ], and you have:
> cbind(exam1.1[34:38, ], 0 - dummy)
         name sex  id item resp  1  2
s34.1   Elsie   F s34    1    1 -1  0
s35.1   Helen   F s35    1    1 -1  0
s1.2  Richard   M  s1    2    1  0 -1
s2.2   Tracie   F  s2    2    1  0 -1
s3.2   Walter   M  s3    2    1  0 -1
Note that the last two columns are added with cbind(); they can be used to code the β parameters for items 1 and 2, respectively. When this coding is applied to the 476 rows in the data set, it produces a dummy variable matrix of 476 rows and 14 columns. Each column of the matrix will be used to estimate the item difficulty associated with one of questions 4 to 17.
> ex.i.dummy <- diag(nlevels(factor(exam1.1$item)))[factor(exam1.1$item), ]
> ex.i.dummy <- 0 - ex.i.dummy   # turns (0, 1) into (0, -1)
> ex.i.dummy <- data.frame(ex.i.dummy, row.names=NULL)
> dimnames(ex.i.dummy)[[2]] <- paste("i", 4:17, sep="")
> dim(ex.i.dummy)
[1] 476  14
Finally, a call to clogit() completes the calculations. It is worth noting how the syntax is written. The strata() function tells clogit() to condition on each person, so the item difficulties associated with the dummy variables are estimated while each person's influence is held constant. The syntax also specifies that item difficulty parameters be estimated for all items except i4, which serves as the reference level: item 4 is the case in which all dummy variables are zero. If i4 were added to the model as well, clogit() would have no reference level left to define and could not tell what to do when all dummy variables are zero.
> attach(ex.i.dummy)   # attach the dummy variables so I can call i4:i17
> # item 4 is the reference
> exam2.clog <- clogit(resp ~ i5+i6+i7+i8+i9+i10+i11+i12+i13+i14+
+                      i15+i16+i17+ strata(id), data=exam1.1)
> summary(exam2.clog)
Call:
clogit(resp ~ i5 + i6 + i7 + i8 + i9 + i10 + i11 + i12 + i13 +
    i14 + i15 + i16 + i17 + strata(id), data = exam1.1)

  n= 476

     coef exp(coef) se(coef)     z       p
i5  0.514      1.67    1.025 0.501 6.2e-01
i6  0.923      2.52    0.991 0.931 3.5e-01
i7  0.514      1.67    1.025 0.501 6.2e-01
i8  1.875      6.52    0.955 1.964 5.0e-02
i9  0.923      2.52    0.991 0.931 3.5e-01
i10 2.598     13.44    0.944 2.752 5.9e-03
i11 4.517     91.54    0.954 4.736 2.2e-06
i12 5.792    327.53    1.028 5.632 1.8e-08
i13 5.537    253.95    1.010 5.484 4.1e-08
i14 6.825    920.74    1.136 6.007 1.9e-09
i15 8.095   3278.95    1.403 5.772 7.8e-09
i16 8.095   3278.95    1.403 5.772 7.8e-09
i17 8.095   3278.95    1.403 5.772 7.8e-09

    exp(coef) exp(-coef) lower .95 upper .95
i5       1.67   0.597954     0.224      12.5
i6       2.52   0.397477     0.361      17.6
i7       1.67   0.597954     0.224      12.5
i8       6.52   0.153373     1.004      42.4
i9       2.52   0.397477     0.361      17.6
i10     13.44   0.074425     2.112      85.5
i11     91.54   0.010924    14.120     593.5
i12    327.53   0.003053    43.650    2457.7
i13    253.95   0.003938    35.104    1837.0
i14    920.74   0.001086    99.293    8538.0
i15   3278.95   0.000305   209.826   51240.2
i16   3278.95   0.000305   209.826   51240.2
i17   3278.95   0.000305   209.826   51240.2

Rsquare= 0.525   (max possible= 0.659 )
Likelihood ratio test= 355  on 13 df,   p=0
Wald test            = 108  on 13 df,   p=0
Score (logrank) test = 277  on 13 df,   p=0
With i4 as the reference level, the coefficients associated with the dummy variables should be interpreted as differences in item difficulty between i4 and each of i5 to i17. For example, the 0.514 associated with i5 means that i5 is more difficult (remember, higher β means higher difficulty) than i4 by 0.514 logits. In addition, items 15, 16, and 17 are equally difficult. They are more difficult than i4 by 8.095 logits, which translates to a large reduction in the probability of answering items 15, 16, and 17 correctly. On a probability scale, 8.095 logits is equivalent to a reduction of about 0.999 (try exp(8.095) / (1 + exp(8.095))).
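If you want to work with these numbers directly rather than read them off the printout, the estimated differences can be extracted from the fitted object. This is a small sketch of my own, not part of the original example; coef() and plogis() are standard R functions, and plogis(x) computes exp(x)/(1 + exp(x)).

> b.diff <- coef(exam2.clog)   # difficulty differences relative to i4
> round(b.diff, 3)             # reproduces the coef column in the summary above
> plogis(8.095)                # about 0.999; same as exp(8.095)/(1 + exp(8.095))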
The clogit() function in R is a general-purpose routine designed for fitting a conditional logit model, so it does not automatically print out many of the statistics specific to the Rasch model. For example, the summary() function does not report an estimated ability level for each person, something that packages like MiniStep do for you. But clogit() in R provides estimates of item difficulty, which are the most important pieces of information for developing surveys or tests.
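If person ability estimates are needed, one common follow-up, sketched below as my own illustration rather than anything the survival library provides, is to fix the item difficulties at their estimated values and then solve, for each raw score, for the ability θ at which the expected number of correct answers equals that score. With i4 as the reference its difficulty is set to 0 and the remaining difficulties are the coefficients above; persons who got all items right or all items wrong (such as Helen) have no finite estimate and are left out, as discussed earlier.

> # fix item difficulties: 0 for the reference item i4, coef() for the rest
> beta <- c(0, coef(exam2.clog))
> # ability estimate for a given raw score (number of items correct, 1 to 13)
> ability <- function(raw.score)
+   uniroot(function(theta) sum(plogis(theta - beta)) - raw.score,
+           interval=c(-10, 10))$root
> round(sapply(1:13, ability), 2)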