Validation

Substantial qualitative and quantitative evidence supports the validity of PROMIS® measures.

Have PROMIS measures been validated?

This question implies a simple “yes” or “no” answer. Unfortunately, asking whether a measure has been validated is not a simple question with a simple answer. An instrument can never be “valid” in any unqualified sense. One might expect an expertly developed fourth-grade math test to be valid for discriminating math proficiency in fourth graders. That validity, however, would not extend to discriminating math ability in high school seniors or social studies knowledge in fifth graders.

The strength of validity evidence is judged in the courtroom of scientific opinion. In fact, the judicial process provides a serviceable metaphor. Validation is the process of building a case for a measure. Psychometric studies are undertaken and their results serve as “character witnesses” that reveal the level and nature of a measure’s usefulness in different populations and for different purposes.

What evidence is there for the validity of PROMIS measures?

Substantial qualitative and quantitative evidence has been gathered that supports the validity of PROMIS measures. Here are some highlights:

Content Validity

What Anastasi said in the context of educational testing is relevant to health measures: “content validity is built into a test from the outset through the choice of appropriate items.”1 Content validation of PROMIS measures began with patient interviews and reviews by expert panels. Details have been published. Once candidate items are developed, they undergo empirical testing, and many are dropped from the final measure. This could result in a loss of content validity. A follow-up study, however, found that PROMIS domain names and definitions remained generally representative of the item banks. Content validity can also be evaluated for specific patient populations after a measure is developed. For example, Forrest and colleagues (2020) evaluated the content validity of pain interference, fatigue, sleep disturbance, and sleep-related impairment measures in children with chronic kidney disease and Crohn’s disease.

1 Anastasi A, 1988. Psychological Testing, New York, Macmillan Publishing Company, pp. 122-127.

Cross-Sectional Validity Evidence

A triplet of articles reported the results of validity evaluations conducted on the first wave of PROMIS data collection. Cella, et al., 2010, reported results of initial validation studies of 11 PROMIS item banks measuring components of self-reported physical, mental, and social health, along with a 10-item Global Health Scale. Liu, et al., 2010 evaluated the representativeness of the Internet panel that was used for initial calibration of the PROMIS item banks. Rothrock, et al., 2010 compared PROMIS scores across chronic condition variables (e.g., number of comorbidities, whether condition was disabling). Many additional publications since then have expanded the evidence for validity of PROMIS measures.

Responsiveness to Change

As a person’s symptoms and function change over time, scores on PROMIS measures are expected to change as well. Responsiveness to change is demonstrated by assessing individuals over time. Often, effect sizes and standardized response means are calculated. There is evidence that PROMIS measures capture change over time in different clinical contexts (e.g., patients starting a new treatment for depression, patients with rheumatoid arthritis initiating a disease-modifying antirheumatic drug, patients with spinal disorders, and children with asthma). Evidence for specific PROMIS measures and patient populations is readily obtainable using a PubMed search.
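The two indices named above can be computed from paired baseline and follow-up scores. The following is a minimal sketch, assuming the common definitions: effect size as mean change divided by the standard deviation of baseline scores, and standardized response mean as mean change divided by the standard deviation of the change scores. The function names and the T-scores are hypothetical and are not drawn from any PROMIS study.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical T-scores for five patients (illustration only).
baseline = [62.0, 58.0, 65.0, 60.0, 55.0]
follow_up = [55.0, 54.0, 57.0, 56.0, 52.0]

es = effect_size(baseline, follow_up)
srm = standardized_response_mean(baseline, follow_up)
print(f"Effect size: {es:.2f}")
print(f"Standardized response mean: {srm:.2f}")
```

Negative values here indicate that scores decreased. For a symptom measure in which higher scores mean more of the symptom, that corresponds to improvement.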

Clinical Validity Evidence

Validity evidence for PROMIS scores has accumulated across a myriad of diseases and clinical conditions. Such evidence is readily obtainable using a PubMed search.

In 2016, a series of articles was published in the Journal of Clinical Epidemiology that evaluated the function of different PROMIS measures across clinical contexts (see overview paper for a summary). The validity evidence presented in this series of papers was unique in that each paper evaluated a single PROMIS domain across multiple conditions in “real-world” clinical settings. Researchers may wish to use the results in judging the likely appropriateness of a given measure for their target clinical context. Results can be used as external anchors to support comparative effectiveness research.

Validation in Pediatric Chronic Disease Populations

The NIH-funded Pediatric Patient Reported Outcomes in Chronic Diseases (PEPR) Consortium conducted multiple validation studies of PROMIS pediatric measures in diverse populations.

TIP: Conduct a literature search to identify current validity evidence for PROMIS measures.

Making the case for using PROMIS

If you are considering a PROMIS measure in a clinic or for a study, you will have substantial evidence to evaluate. The questions you ask about that evidence can also serve as the framework for supporting your choice to others (e.g., administrators, granting agency). Here are some questions you should consider:

  • Why is it important to measure this construct or these constructs in my study or clinic? Describe the relevance of the symptom or outcome to the population of interest.
  • What psychometric evidence has accumulated when this measure was used in my targeted population? If you are not able to find a study in your population, weigh the evidence that exists for the measure across populations.
  • It is appropriate to consider evidence gathered about the validity of a measure, even if a different assessment strategy was used (e.g., one short form versus another, CAT versus short form). You should keep in mind that shorter short forms sacrifice some reliability for reduction in response burden.
  • What are the alternatives to the PROMIS measures? State clearly why you believe a PROMIS measure is a good choice, particularly for your population and for your particular purpose. Remember, validity resides in the use of the scores.
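The trade-off between form length and reliability noted in the list above can be illustrated with the Spearman-Brown prophecy formula from classical test theory. This is only a rough sketch under stated assumptions: it treats items as interchangeable (parallel), whereas PROMIS measures are calibrated with item response theory, so actual precision depends on which items are administered and varies across the score range. The starting reliability of 0.90 and the 8-item form are hypothetical values.

```python
def spearman_brown(reliability, length_factor):
    """Predict reliability after changing test length by length_factor.

    length_factor is new length divided by original length
    (e.g., 0.5 when a form is cut to half as many items).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical: an 8-item form with reliability 0.90 (illustration only).
full_length = 8
full_reliability = 0.90

for n_items in (8, 6, 4):
    predicted = spearman_brown(full_reliability, n_items / full_length)
    print(f"{n_items} items: predicted reliability {predicted:.2f}")
```

Under these assumptions, predicted reliability falls as items are removed, which is the cost weighed against the lower response burden.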

How to describe psychometric evidence for a grant proposal

Below is an example, written in 2014, of reporting the psychometric properties of a PROMIS instrument (PROMIS Sleep Disturbance). Remember, however, that the tone, length, and focus are context dependent. Consider the audience when describing the properties of a measure.

PROMIS Sleep Disturbance (PROMIS-SD)

The PROMIS-SD items assess self-reported perceptions of sleep quality, sleep depth, and restoration associated with sleep. This includes perceived difficulties and concerns with getting to sleep or staying asleep, as well as perceptions of the adequacy of, and satisfaction with, sleep. Sleep Disturbance does not focus on symptoms of specific sleep disorders; those symptoms are addressed in the PROMIS Sleep Related Impairment bank. The PROMIS-SD has demonstrated excellent validity, as evidenced by associations with disease activity, depression, female sex, smoking, and use of corticosteroids or narcotics (N=3173; inflammatory bowel disease) (Ananthakrishnan, 2013), the ability to distinguish between those with and without sleep disorders (Buysse, 2010), and prediction (along with negative affect) of global ratings of improvement in back pain (Karp, 2014). PROMIS-SD scores predicted return of active disease in a subsample of patients with Crohn’s disease (N=1291) who were in remission at baseline (Ananthakrishnan, 2013). Those with sleep disturbance, as measured by the PROMIS-SD, had a 2-fold increase in risk of active disease at six months (adjusted odds ratio, 2.00). The PROMIS-SD has been tested and has exhibited validity evidence (e.g., expected associations, discrimination among known groups) in a wide range of populations including, but not limited to, parents in the neonatal ICU (Busse, 2013), individuals with neurological conditions (Cook, 2012), patients with pelvic pain (Fenton, 2011), and patients with head and neck cancer (Stachler, 2014).

 

Last updated 4/21/2023