This isn't specific to prospective studies, but in 2019, Segawa, Schalet, and Cella compared PROMIS CATs with 4-, 6-, and 8-item short forms. Specifically, they evaluated the range of accurate scores, number of items administered, floor (level of the worst symptom/function a measure can quantify), and ceiling (level of the best symptom/function a measure can quantify).
• CATs offer the widest range of accurate scores.
• CATs averaged 4.7 items administered.
• 4-, 6-, and 8-item short forms offer a wide range of accurate scores. Figure 1 provides the specific range for each short form. This can be used to evaluate how well each short form is likely to cover the desired range of scores for a target patient sample.
• Use CATs “(1) when a substantial number of participants with extremely poor health is anticipated; (2) when there is a need to measure very healthy participants accurately;” and (3) “when the administration of small number of items is required” (p. 217).
Segawa, E., Schalet, B., & Cella, D. (2019). A comparison of computer adaptive tests (CATs) and short forms in terms of accuracy and number of items administered using PROMIS profile. Quality of Life Research, 29, 213–221. https://doi.org/10.1007/s11136-019-02312-8
Thank you for the detailed reply and suggested references. My orthopedic clinic has been issuing older static PROMIS forms to patients for prospective evaluation (pre- and post-operatively), and we have the opportunity to switch to CAT forms in the near future. I think making the switch might give us some interesting data to compare, as well as reduce response burden, etc.
Item-level comparisons were key to my initial query to HM.net, but my message was not clear. When we have compared individual PROMIS items against ostensibly related individual legacy PRO items, there are some pretty neat findings.
Thank you again for the reply!
The evidence suggests that scores from PROMIS short forms and CATs are comparable (Choi et al., 2010), but a small amount of error or difference is introduced when switching, so it is preferable not to switch forms (of any kind) in the midst of a longitudinal study if it can be helped. While autocorrelation would suggest that SFs are more “comparable” or stable than CATs over time, any number of factors could influence the ability to detect and quantify change. We cannot conclude that one is better than the other in longitudinal studies comparing scores over time. It is highly contextual, so it may not be possible to generalize across item banks and different SFs. That said, measured change appeared similar for the PROMIS 8-item short form and the CAT in one study (Flynn et al., 2015). For a fuller discussion, see also Devine et al. (2016). Of course, if you need to compare changes in symptoms/function at the item level (which might be useful in some clinical contexts), CATs wouldn’t allow you to do that, since different patients answer different items.
Flynn, K., Dew, M., Lin, L., Fawzy, M., Graham, F., Hahn, E., . . . Weinfurt, K. (2015). Reliability and construct validity of PROMIS® measures for patients with heart failure who undergo heart transplant. Quality of Life Research, 24(11), 2591–2599. https://doi.org/10.1007/s11136-015-1010-y
Choi, S., Reise, S., Pilkonis, P., Hays, R., & Cella, D. (2010). Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Quality of Life Research, 19(1), 125–136. https://doi.org/10.1007/s11136-009-9560-5
Devine, J., Fliege, H., Kocalevent, R., Mierke, A., Klapp, B. F., & Rose, M. (2016). Evaluation of computerized adaptive tests (CATs) for longitudinal monitoring of depression, anxiety, and stress reactions. Journal of Affective Disorders, 190, 846–853.
Hi, I am hoping a moderator or researcher viewing this forum can point me to published articles or grey literature describing the rationale and evidence supporting a preference for short forms over computer adaptive testing for prospective assessment. It seems obvious that static items would be more directly comparable over time, but is there a published study that supports this?