PROMIS® Reference Populations

PROMIS measures use scores that have meaning. A PROMIS score of 50 is the average (or mean) score for a specific, relevant group of people (e.g., the U.S. general population, kids with a painful condition). That group is the reference population.

 

PROMIS measures use scores that are more than just numbers: PROMIS scores have meaning. For example, a PROMIS score of 50 is the mean score for that measure. But average for whom? This is where “score meaning” comes in. Each PROMIS measure has its own reference population, that is, a specific group of people (e.g., the U.S. general population, kids with a painful condition) who were sampled and then thoroughly assessed with the measure. The average score of 50 for that measure is their average score. Thus, when someone is newly assessed with a PROMIS measure and her score is 50, we can say that her score equals the average score of the measure’s reference population, the group to which we refer for score meaning.

Reference Populations

A T-score is a standardized score, like z-scores and IQ scores. All standardized scores have a “middle” score; it is zero for z-scores, 100 for IQ scores, and 50 for T-scores. This middle score is the mean of a large sample that is representative of a relevant population—a reference population. The large sample used to represent the reference population is called the Centering Sample.

For some PROMIS measures, the reference population (and the centering sample) was a clinical population. This is the case for the PROMIS Smoking measures, for which the reference population was daily smokers; the centering sample was a sample of daily smokers. For many PROMIS measures, the reference population was the U.S. general population as described by the 2000 U.S. Census. The centering sample was a large sample of individuals who represented that population.

Centering Sample and Calibration Sample

It is helpful to remember that the middle score of a standard score range has to be defined. For measures that use a T-score metric, 50 is the mean and 10 is the standard deviation, but they do not start out that way. The scores are first estimated using an item response theory (IRT) model, and the IRT-calibrated scores are then transformed to the T-score metric using a linear transformation. But first you have to decide which score on the IRT metric is going to be the middle score, that is, a score of 50. This is done by collecting scores from a large sample that represents the reference population and then calculating the mean for that sample. That sample is the centering sample, and its mean becomes the middle score (e.g., 50 for T-scores). The linear transformation then spaces all other scores along the continuum so that they have the correct values relative to that middle score.
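The centering step can be sketched in a few lines of Python. This is a minimal illustration, not the official PROMIS scoring procedure: the function name `theta_to_t`, the made-up sample values, and the choice to scale by the centering sample's standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def theta_to_t(theta, centering_mean, centering_sd):
    """Linear transformation: the centering-sample mean maps to 50,
    and one centering-sample standard deviation maps to 10 points."""
    return 50.0 + 10.0 * (theta - centering_mean) / centering_sd

# Hypothetical IRT-metric scores from a made-up centering sample.
centering_thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]
m = mean(centering_thetas)     # middle of the IRT metric for this sample
s = pstdev(centering_thetas)   # assumption: population SD of the sample

# After the transformation the centering sample has mean 50 and SD 10
# (up to floating-point rounding).
t_scores = [theta_to_t(x, m, s) for x in centering_thetas]
print(round(mean(t_scores), 6), round(pstdev(t_scores), 6))
```

Any person scored later is passed through the same function with the same `m` and `s`, which is what ties their T-score to the reference population.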

IMPORTANT: The Centering Sample and the Calibration Sample may not be the same sample.

The purpose of a calibration sample is to estimate item parameters (item characteristics such as difficulty and discrimination) using an item response theory model. Here’s where it can get confusing. Sometimes a single sample was used as both the calibration sample AND the centering sample. Other times one sample was used as the calibration sample and another was used as the centering sample.  Sometimes, a subset of the calibration sample served as the centering sample.

The Reference Population tables show the calibration and the centering samples for PROMIS. Most users will be particularly interested in the last column (Centering Sample). If you want to know what a score in the middle is (e.g., 50 for those scored on a T-score metric), go to the Centering Sample column. For example, if you go to the row for PROMIS-Cancer-Anxiety you will see that the item parameters were estimated (calibrated) using a hybrid of individuals with cancer and individuals from the general population. BUT, the centering sample was the general population. A score of 50 on this measure is comparable to the general population average level of anxiety.
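For readers who keep these tables in code, the lookup described above might be sketched as follows. The dictionary holds only a few rows copied from the tables on this page; the names `REFERENCE_TABLE` and `describe_midpoint` are hypothetical and exist only for this illustration.

```python
# (calibration sample, centering sample) for a handful of adult measures,
# copied from the Reference Population tables on this page.
REFERENCE_TABLE = {
    "Emotional Distress - Anxiety": ("General population", "General population"),
    "PROMIS-Cancer - Anxiety": ("Hybrid", "General population"),
    "Psychosocial Illness Impact - Positive": (
        "Clinical sample (cancer)",
        "Clinical sample (cancer)",
    ),
    "Smoking – Nicotine Dependence": (
        "Clinical sample (smokers drawn from the general population)",
        "Daily smokers",
    ),
}

def describe_midpoint(measure):
    """Say whose average a T-score of 50 represents for the given measure."""
    calibration, centering = REFERENCE_TABLE[measure]
    return (
        f"T = 50 on {measure} is the mean of: {centering} "
        f"(items calibrated on: {calibration})"
    )

print(describe_midpoint("PROMIS-Cancer - Anxiety"))
```

The point of keeping the two samples as separate fields is that only the second one answers the question "average for whom?".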

What does the middle score mean?

When developing a measure with standard scores, an important consideration is what the middle score means. The scores of such measures are purposefully “centered” at the mean of a specific sample or subsample. PROMIS uses T-scores, so the middle score is always 50. Centering scores in this way allows quick interpretation of where an individual is on a symptom or outcome compared to others in the reference population. A score of 50 on PROMIS Fatigue, for example, is comparable to the U.S. “average”. T-scores have a standard deviation of 10, so a score of 60 would indicate fatigue that is one standard deviation higher than the U.S. average.
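The arithmetic behind that interpretation is short enough to write out. The sketch below is illustrative only: the function names are hypothetical, and the percentile step assumes scores in the reference population are normally distributed, which is an assumption of this example and not something the PROMIS documentation states.

```python
from statistics import NormalDist

def sd_units_from_mean(t_score, ref_mean=50.0, ref_sd=10.0):
    """Distance of a T-score from the reference mean, in standard deviations."""
    return (t_score - ref_mean) / ref_sd

def approx_percentile(t_score, ref_mean=50.0, ref_sd=10.0):
    """Rough percentile, ASSUMING a normal distribution in the reference population."""
    return 100.0 * NormalDist(mu=ref_mean, sigma=ref_sd).cdf(t_score)

print(sd_units_from_mean(60))            # 1.0
print(round(approx_percentile(60), 1))   # about 84.1 under the normality assumption
```

Empirically estimated percentiles, where they have been published for a measure, should be preferred over this normal approximation.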

TIP: Failure to be specific about the reference population invites confusion.


This can all get very confusing because sometimes the calibration sample (the sample used to estimate item response theory parameters) and centering sample (the sample used to define the middle of the score range) were the same. But sometimes they were different. For example, a measure may be calibrated in a clinical sample but then centered in the general population. The mean of T=50 for that measure reflects the average in the general population, not the clinical sample.

PROMIS Adult
Item Bank/Scale | Calibration Sample | Centering Sample
Global Mental | General population | General population
Global Physical | General population | General population
Emotional Distress - Anger | General population | General population
Emotional Distress - Anxiety | General population | General population
   PROMIS-Cancer - Anxiety | Hybrid* | General population
Emotional Distress - Depression | General population | General population
   PROMIS-Cancer - Depression | Hybrid* | General population
Cognitive Function | General population | General population
Cognitive Function - Abilities | General population | General population
Psychosocial Illness Impact - Positive | Clinical sample (cancer) | Clinical sample (cancer)
Psychosocial Illness Impact - Negative | Clinical sample (cancer) | Clinical sample (cancer)
Self-Efficacy - General | General population | General population
Self-Efficacy - Manage Emotions | Clinical sample | Clinical sample
Self-Efficacy - Manage Meds/Treatment | Clinical sample | Clinical sample
Self-Efficacy - Manage Social Interactions | Clinical sample | Clinical sample
Self-Efficacy - Manage Daily Activities | Clinical sample | Clinical sample
Self-Efficacy - Manage Symptoms | Clinical sample | Clinical sample
Alcohol - Alcohol Use | General population + Clinical sample | General population + Clinical sample
Alcohol - Positive Consequences | General population + Clinical sample | General population + Clinical sample
Alcohol - Negative Consequences | General population + Clinical sample | General population + Clinical sample
Alcohol - Positive Expectancies | General population + Clinical sample | General population + Clinical sample
Alcohol - Negative Expectancies | General population + Clinical sample | General population + Clinical sample
Smoking - Coping Expectancies (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Smoking – Emotional/Sensory Expectancies (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Smoking – Negative Health Expectancies (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Smoking – Nicotine Dependence (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Smoking – Negative Psychosocial Expectancies (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Smoking – Social Motivations (All Smokers; Daily Smokers; Nondaily Smokers) | Clinical sample (smokers drawn from the general population) | Daily smokers
Substance Use - Appeal (Past 30 days; Past 3 months) | Clinical and non-clinical substance users | Clinical and non-clinical substance users
Substance Use - Severity (Past 30 days; Past 3 months) | Clinical and non-clinical substance users | Clinical and non-clinical substance users
Substance Use - Prescription Pain Medication Misuse | Participants who reported possession of a prescription for pain medication and potential misuse of such a medication | Participants who reported possession of a prescription for pain medication and potential misuse of such a medication
Life Satisfaction | General population | General population
Meaning and Purpose | General population | General population
Medication Adherence | N/A | N/A
Positive Affect | General population | General population
Fatigue | General population | General population
   PROMIS-Cancer - Fatigue | General population | General population
Pain - Behavior | General population + Clinical sample | General population
Pain - Interference | General population + Clinical sample | General population
   PROMIS-Cancer - Pain Interference | Hybrid* | General population
Pain Intensity | General population + Clinical sample | General population
Physical Function | General population | General population
   PROMIS-Cancer - Physical Function | Hybrid* | General population
   - Mobility | General population | General population
   - Upper Extremity | General population | General population
Physical Function for Samples with Mobility Aid Users | Clinical sample | Clinical sample
Sleep Disturbance | General population + Clinical sample | General population + Clinical sample
Sleep-Related Impairment | General population + Clinical sample | General population + Clinical sample
Satisfaction with Participation in Discretionary Social Activities (v1.0) | General population | General population
Satisfaction with Participation in Social Roles (v1.0) | General population | General population
Satisfaction with Social Roles and Activities (v2.0) | General population | General population
Ability to Participate in Social Roles and Activities (v2.0) | General population | General population
Companionship | General population | General population
Informational Support | General population | General population
Emotional Support | General population | General population
Instrumental Support | General population | General population
Social Isolation | General population | General population
Dyspnea – Activity Motivation | NA - Pool | NA - Pool
Dyspnea – Activity Requirements | NA - Pool | NA - Pool
Dyspnea – Airborne Exposure | NA - Pool | NA - Pool
Dyspnea – Assistive Devices | NA - Pool | NA - Pool
Dyspnea – Characteristics | NA - Pool | NA - Pool
Dyspnea – Emotional Response | NA - Pool | NA - Pool
Dyspnea – Functional Limitations | COPD Sample | COPD Sample
Dyspnea – Task Avoidance | NA - Pool | NA - Pool
Dyspnea – Time Extension | NA - Pool | NA - Pool
Dyspnea – Severity | COPD Sample | COPD Sample
Gastrointestinal – Belly Pain | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Bowel Incontinence | NA - Pool | NA - Pool
Gastrointestinal – Constipation | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Diarrhea | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Disrupted Swallowing | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Gas and Bloating | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Gastroesophageal Reflux | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Gastrointestinal – Nausea and Vomiting | GI Sample (clinical sample + general population who reported at least 1 GI symptom) | General population who reported at least 1 GI symptom
Itch – Activity & Clothing | Clinical sample | Clinical sample
Itch – Mood & Sleep | Clinical sample | Clinical sample
Itch – Interference | Clinical sample | Clinical sample
Itch – Quality | Clinical sample | Clinical sample
Itch – Scratching Behavior | Clinical sample | Clinical sample
Itch – Severity | Clinical sample | Clinical sample
Itch – Triggers | Clinical sample | Clinical sample
Pain Quality – Neuropathic Pain | Clinical sample with painful conditions | Clinical sample with painful conditions
Pain Quality – Nociceptive Pain | Clinical sample with painful conditions | Clinical sample with painful conditions
Sexual Function and Satisfaction: Anal Discomfort with Sexual Activity (for Sexually Active People) | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Bother Regarding Sexual Function (Male and Female measures) | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Erectile Function (for Sexually Active Men) | Sexually active men from the general population and men with sexual dysfunction | Sexually active men
Sexual Function and Satisfaction: Factors Interfering with Sexual Satisfaction | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Interest in Sexual Activity | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Oral Discomfort with Sexual Activity (for Sexually Active People) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active adults
Sexual Function and Satisfaction: Oral Dryness with Sexual Activity (for Sexually Active People) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active adults
Sexual Function and Satisfaction: Orgasm – Ability (for Sexually Active People) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active adults
Sexual Function and Satisfaction: Orgasm - Pleasure (for Sexually Active People) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active adults
Sexual Function and Satisfaction: Satisfaction with Sex Life | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active adults
Sexual Function and Satisfaction: Screeners | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Sexual Activities (Male and Female measures) | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Therapeutic Aids for Sexual Activity (Male and Female measures) | NA - Pool | NA - Pool
Sexual Function and Satisfaction: Vaginal Discomfort with Sexual Activity (for Sexually Active Women) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active women
Sexual Function and Satisfaction: Vaginal Lubrication for Sexual Activity (for Sexually Active Women) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active women
Sexual Function and Satisfaction: Vulvar Discomfort with Sexual Activity – Clitoral (for Sexually Active Women) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active women
Sexual Function and Satisfaction: Vulvar Discomfort with Sexual Activity – Labial (for Sexually Active Women) | Sexually active adults from the general population and adults with sexual dysfunction | Sexually active women

* Hybrid: items that did not show differential item functioning (DIF) between the general population and cancer patients used the PROMIS parameters. DIF items used cancer-based parameters. None of the items in the fatigue item bank showed DIF, so all of them used PROMIS parameters.

PROMIS Pediatric
Bank/Scale | Calibration Sample | Centering Sample
Global Health | General population | General population
Emotional Distress - Anger (v3.0) | General population | General population
Emotional Distress - Anxiety (v3.0) | General population | General population
Emotional Distress - Depression (v3.0) | General population | General population
Cognitive Function | General population | General population
Life Satisfaction | General population | General population
Meaning and Purpose | General population | General population
Psychological Stress Experiences | General population | General population
Positive Affect | General population | General population
Stigma | Clinical sample (Children with chronic conditions) | Clinical sample (Children with chronic conditions)
Stigma - Skin | Clinical sample (Children with chronic conditions) | Clinical sample (Children with chronic conditions)
Fatigue (v3.0) | General population | General population
Itch (PIQ-C) | Clinical sample (Children with skin conditions) | General population
Pain - Behavior (v3.0) | General population | General population
Pain - Interference (v3.0) | General population | General population
Mobility (v3.0) | General population | General population
Upper Extremity (v3.0) | General population | General population
Sleep Disturbance | General population | General population
Sleep-Related Impairment | General population | General population
Physical Activity | General population | General population
Physical Stress Experience | General population | General population
Strength Impact | General population | General population
Asthma Impact | Clinical sample | Clinical sample
Peer Relationships (v3.0) | General population | General population
Pain Quality (v3.0) | General population | General population
Pain Quality - Affective (v3.0) | General population | General population
Pain Quality - Sensory (v3.0) | General population | General population
Family Relationships | General population | General population

PROMIS Early Childhood Parent-Report
Bank/Scale | Calibration Sample | Centering Sample
Global Health | General population | General population
Anger/Irritability | General population | General population
Anxiety | General population | General population
Engagement – Curiosity | General population | General population
Engagement – Persistence | General population | General population
Depressive Symptoms | General population | General population
Physical Activity | General population | General population
Positive Affect | General population | General population
Self-Regulation – Flexibility | General population | General population
Self-Regulation – Frustration Tolerance | General population | General population
Sleep Health | General population | General population
Social Relationships (Child-Caregiver, Family, Peer) | General population | General population

PROMIS Parent Proxy
Bank/Scale | Calibration Sample | Centering Sample
Global Health | General population | General population
Emotional Distress - Anger (v3.0) | General population | General population
Emotional Distress - Anxiety (v3.0) | General population | General population
Emotional Distress - Depression (v3.0) | General population | General population
Cognitive Function | General population | General population
Life Satisfaction | General population | General population
Meaning and Purpose | General population | General population
Psychological Stress Experiences | General population | General population
Positive Affect | General population | General population
Stigma | Clinical sample (Parents of children with chronic conditions) | Clinical sample (Parents of children with chronic conditions)
Stigma - Skin | Clinical sample (Parents of children with chronic conditions) | Clinical sample (Parents of children with chronic conditions)
Fatigue (v3.0) | General population | General population
Itch | Clinical sample (Parents of children with skin conditions) | General population
Pain - Behavior (v3.0) | General population | General population
Pain - Interference (v3.0) | General population | General population
Mobility (v3.0) | General population | General population
Upper Extremity (v3.0) | General population | General population
Sleep Disturbance | General population | General population
Sleep-Related Impairment | General population | General population
Physical Activity | General population | General population
Physical Stress Experience | General population | General population
Strength Impact | General population | General population
Asthma Impact | Clinical sample | Clinical sample
Peer Relationships (v3.0) | General population | General population
Family Relationships | General population | General population

Norms

A unique aspect of PROMIS measures is their use of standardized scores that are centered on a relevant reference population. Such scores are called “normative” because their value represents how close or far away they are from a normative population. The word “norm” has different meanings for different contexts. Here, we are not talking about social “norms,” the behaviors we expect from others and ourselves in society. We are also not talking about “normal” per se, even though the term originated from its reference to a standard, bell-shaped distribution curve that labels everything in the vast middle as normal. We use the word norm without applying judgment as to the “normality” of any given score relative to the distribution of scores seen on the same measure in a large group of people. Sometimes we refer to the group as a “reference group” and similarly to “norms” as reference values, because they are points of reference from which to understand a given single score.

For example, Jensen et al. published reference values for eight PROMIS domains for individuals with cancer. The mean Pain Interference score for people with cancer was 52.

For some PROMIS measures there are subpopulation norms.

Subpopulation Norms

Norms are based on the distribution of scores on a measure for a well-characterized and relevant population. For example, I am 5’6” tall. I might be interested in comparing my height to other people in the world, or maybe just the United States since that’s where I live. On the other hand, I am a woman, so average height without considering gender doesn’t really matter to me. I would be more interested in comparing my height to the mean height of other women in the United States. According to Wikipedia, the average height for women in the United States is 5’4” and therefore I’m feeling pretty tall right now. I would not feel quite so tall if I were comparing myself to men in the United States (5’9”), and I would only be eye-to-eye with the average woman in the Netherlands (5’6”), which brings us, finally, to normative score comparisons for health outcomes measures.

Let’s start with a distinction between norms that are used for comparative purposes and norms that are used to anchor a scale. Many of the PROMIS measures (see elsewhere on this site) are centered on the mean score of a sample of individuals that, collectively, matched the 2000 U.S. Census with respect to important demographics (e.g., gender, age, race/ethnicity, education). The beauty of folding the norms into the metric is that it is easy to interpret a score relative to the population whose norm was used. With the PROMIS scores that are centered on the U.S. Census population, the mean is 50 and the standard deviation is 10. So, if my fatigue score is 60, I know that I’m not just feeling tired; my fatigue is one standard deviation above the mean of the general population (or at least of a sample that matched it). But, as of this writing, I’m 62 years old. I’d like to know how my fatigue compares to people my age. That’s where sub-norms can be helpful.

Sub-norms divide a relevant population into subgroups to aid interpretation of scores. Above, I compared my height to that of women in the US, not to all people in the world or even to all people in the US. I found it more relevant to compare my height to a sub-norm value—the mean height of women in the US.

It is theoretically possible to develop sub-norms for scores on a measure based on any relevant population, though such data collection can be expensive. As HealthMeasures continues to be used and more data accumulate, however, it may become practical to develop many sub-norms for comparing and interpreting scores.

Fortunately, the initial norming sample for PROMIS was quite large, and it was feasible to disaggregate it by gender and age range in order to estimate sub-group norms. This was done in 2011 to compare fatigue and pain by age range in the general population with those of samples of individuals with disabilities. The gender and age range norms were calculated for all PROMIS measures developed in the first phase of PROMIS testing. These were never published but are provided here for users who are interested. Sample sizes, means, and standard deviations by domain are reported in the table below.

WARNING

The original PROMIS norming sample was not powered to develop subgroup norms. The user should pay particular attention to the size and characteristics of the sample used to develop each sub-norm. For example, much larger samples were used to calculate sub-norms for males and females than for the age ranges. More confidence is warranted for sub-norms estimated with larger sample sizes. Nevertheless, these sub-norms can be useful both in comparing samples and in interpreting scores. For reference, consider how they were used in papers by Cook et al. and Molton et al.

Gender and Age Range Sub-norms for Adult PROMIS Measures Centered on the US General Census 2000

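Domain | Statistic | Gender: Female | Gender: Male | Age 18-34 | Age 35-44 | Age 45-54 | Age 55-64 | Age 65-74 | Age 75+
Anger | N | 1865 | 1204 | 730 | 565 | 499 | 495 | 401 | 379
Anger | Mean | 50.6 | 49.1 | 53.0 | 51.5 | 50.4 | 48.8 | 47.5 | 45.7
Anger | SD | 10.2 | 9.6 | 10.7 | 10.3 | 9.5 | 9.7 | 8.7 | 7.9
Anxiety | N | 1654 | 1069 | 659 | 496 | 417 | 442 | 365 | 345
Anxiety | Mean | 50.9 | 48.6 | 52.4 | 50.9 | 50.1 | 49.3 | 48.1 | 46.9
Anxiety | SD | 10.2 | 9.5 | 10.7 | 11.1 | 9.5 | 9.5 | 8.8 | 7.9
Depression | N | 1269 | 890 | 496 | 366 | 359 | 373 | 290 | 276
Depression | Mean | 50.9 | 48.7 | 52.3 | 50.6 | 50.8 | 49.5 | 48.4 | 46.5
Depression | SD | 10.1 | 9.7 | 10.9 | 10.9 | 10.0 | 9.7 | 8.8 | 7.2
Fatigue | N | 1884 | 1183 | 706 | 551 | 513 | 516 | 396 | 385
Fatigue | Mean | 51.1 | 48.2 | 50.5 | 51.0 | 51.6 | 49.7 | 48.1 | 48.0
Fatigue | SD | 10.1 | 9.6 | 9.7 | 10.7 | 10.1 | 10.8 | 9.3 | 8.3
Pain Behavior | N | 1851 | 1199 | 699 | 561 | 507 | 507 | 402 | 374
Pain Behavior | Mean | 50.7 | 49.0 | 47.6 | 50.0 | 52.2 | 51.3 | 50.1 | 49.7
Pain Behavior | SD | 10.1 | 9.7 | 10.2 | 10.6 | 10.1 | 9.7 | 9.3 | 8.7
Pain Interference | N | 1856 | 1180 | 712 | 548 | 499 | 488 | 406 | 383
Pain Interference | Mean | 51.1 | 48.3 | 47.8 | 50.1 | 51.9 | 51.6 | 49.9 | 49.7
Pain Interference | SD | 10.3 | 9.3 | 9.0 | 10.2 | 11.1 | 10.9 | 9.3 | 8.7
Physical Function | N | 2044 | 1363 | 782 | 605 | 567 | 565 | 457 | 431
Physical Function | Mean | 48.9 | 51.7 | 55.1 | 52.0 | 49.0 | 47.5 | 47.2 | 45.6
Physical Function | SD | 10.0 | 9.7 | 8.4 | 9.8 | 10.4 | 10.4 | 9.0 | 8.5
Global Mental Health | N | 3008 | 2206 | 1183 | 863 | 902 | 873 | 715 | 679
Global Mental Health | Mean | 49.4 | 50.8 | 48.5 | 48.4 | 48.2 | 50.3 | 53.1 | 53.4
Global Mental Health | SD | 10.0 | 10.0 | 9.7 | 10.4 | 10.3 | 10.5 | 8.8 | 8.4
Global Physical Health | N | 3015 | 2212 | 1182 | 865 | 910 | 875 | 713 | 683
Global Physical Health | Mean | 49.1 | 51.2 | 51.6 | 50.1 | 48.2 | 48.8 | 51.0 | 49.9
Global Physical Health | SD | 10.1 | 9.8 | 8.4 | 9.8 | 10.9 | 11.3 | 9.9 | 9.2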
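
As a worked illustration of how a sub-norm changes the reading of a score, the sketch below compares a Fatigue T-score of 60 against the general-population anchor (mean 50, SD 10) and against a few Fatigue sub-norms copied from the table above. The names `FATIGUE_SUBNORMS` and `sd_units` are hypothetical, and expressing the comparison in standard deviation units is one reasonable choice for this example, not a prescribed PROMIS method.

```python
# (mean, SD) pairs copied from the Fatigue rows of the sub-norm table.
FATIGUE_SUBNORMS = {
    "Female": (51.1, 10.1),
    "Male": (48.2, 9.6),
    "55-64": (49.7, 10.8),
}

def sd_units(t_score, ref_mean, ref_sd):
    """How many reference-group standard deviations a score lies from that group's mean."""
    return (t_score - ref_mean) / ref_sd

score = 60.0
vs_general = sd_units(score, 50.0, 10.0)                     # 1.0 SD above the general population
vs_age_peers = sd_units(score, *FATIGUE_SUBNORMS["55-64"])   # about 0.95 SD above ages 55-64
vs_women = sd_units(score, *FATIGUE_SUBNORMS["Female"])      # about 0.88 SD above women

print(round(vs_general, 2), round(vs_age_peers, 2), round(vs_women, 2))
```

The differences here are modest, and the sample-size caveat in the warning above applies to every sub-norm used this way.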

Percentiles

A percentile can be used to reflect how an individual’s score compares to a reference population. Carle and colleagues (2021) estimated percentiles for many PROMIS Pediatric and Parent Proxy v2.0 (Anger, Anxiety, Depressive Symptoms, Fatigue, Mobility, Pain Behavior, Pain Interference, Peer Relationships, Upper Extremity Function) and v1.0 (Family Relationships, Global Health, Life Satisfaction, Meaning and Purpose, Physical Activity, Physical Stress Experiences, Positive Affect, Psychological Stress Experiences, Sleep Disturbance, Sleep Impairment) measures.

Last updated on 4/29/2024