Psychometric Properties and Questions

All HealthMeasures utilized item response theory in their development. Information about measurement development and analyses can be found in publications and resources on this site.

IRT-based Reliability and Cronbach's Alpha

5 years 4 months ago #220

This was originally an emailed question, but it is a good one to share here as well. I'll post the abbreviated question here, followed by my response.

We have calculated Cronbach’s alpha using raw scores. However, for the T-scores, should we calculate alpha, and if so, should we weight the raw scores using the item discrimination parameters? Or should we instead calculate reliability for T-scores directly by estimating a 2-parameter IRT model, rather than attempting to compute Cronbach’s alpha for T-scores?

Regarding the SEM, we assume then that for computing the SEM for T-scores, we should use whichever reliability estimate is most appropriate (Cronbach’s alpha or IRT model-based). Is this what you would advise?


Cronbach’s alpha should be viewed as the lower bound for internal consistency reliability. It has several limitations, but newer alternatives (like omega) are more complicated to calculate, especially in new samples.
Reliability is related to the standard error, like you said. But both are classical test theory concepts, where you look for one-number summaries. IRT doesn’t work that way.
In IRT-based scores, reliability is conditional on the score estimate, so it will be different for every response pattern. Reliability equals 1-SE^2, where SE is the standard error of an estimated theta score (on the mean=0, sd=1 metric). You can then calculate the average reliability for your sample as 1-mean(SE^2), that is, 1 minus the mean of the squared standard errors, and use that. Alternatively, you could use an IRT program to calculate the marginal reliability across a specific theta region, given the existing item parameters. Choose sum-score based EAP scoring. The very bottom of the output gives the marginal reliability given the properties of the items (and thus independent of your sample).
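As a rough illustration of the arithmetic above, here is a minimal Python sketch. It assumes you already have each respondent's standard error on the theta metric (mean=0, sd=1), and it reads the sample-average formula as 1 minus the mean of the squared SEs. The function names and the example SE values are hypothetical, made up only to show the calculation.

```python
from statistics import mean


def conditional_reliability(se_theta):
    """Reliability for one score estimate: 1 - SE^2 (theta metric, mean=0, sd=1)."""
    return 1.0 - se_theta ** 2


def average_reliability(se_values):
    """Sample-average reliability: 1 minus the mean of the squared SEs."""
    return 1.0 - mean(se ** 2 for se in se_values)


# Hypothetical theta standard errors for five respondents (illustration only).
example_se = [0.20, 0.25, 0.30, 0.35, 0.50]

# Each respondent gets their own reliability value.
per_person = [conditional_reliability(se) for se in example_se]

# One-number summary for this (made-up) sample.
sample_reliability = average_reliability(example_se)
```

Note that the per-person values differ across respondents, which is the point made above: the single summary number hides that variation.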
Marginal reliability and the associated SEM are one way to get a one-number summary using IRT-based statistics. But in reality, each person and each score has its own SE/SEM and associated reliability. I’d favor those statistics over alpha.
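To show what a person-specific SEM might look like on the T-score metric, here is a small Python sketch. It assumes the usual linear T-score transformation (T = 50 + 10 * theta), so a theta SE is simply multiplied by 10, and it assumes a normal-theory 95% interval (z = 1.96). Both are assumptions added for illustration, as are the function names and the example values.

```python
def theta_se_to_t_sem(se_theta, t_sd=10.0):
    """Convert a theta-metric SE to a T-score SEM (assumes T = 50 + 10 * theta)."""
    return t_sd * se_theta


def t_score_interval(theta, se_theta, z=1.96, t_mean=50.0, t_sd=10.0):
    """Approximate interval around one person's T-score (normality assumed)."""
    t_score = t_mean + t_sd * theta
    sem = theta_se_to_t_sem(se_theta, t_sd)
    return t_score - z * sem, t_score + z * sem


# Hypothetical respondent: theta = 0.5 with SE = 0.3 (illustration only).
sem_t = theta_se_to_t_sem(0.3)
lower, upper = t_score_interval(0.5, 0.3)
```

A respondent with a larger SE would get a wider interval, so the precision of the score is reported per person rather than through a single sample-wide value.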

