Author manuscript; available in PMC: 2021 Jul 1.
Published in final edited form as: J Allergy Clin Immunol Pract. 2020 May 4;8(7):2341–2350.e1. doi: 10.1016/j.jaip.2020.04.048

Development and Preliminary Validation of a New Patient Reported Outcome Measure for Chronic Rhinosinusitis (CRS-PRO)

Saied Ghadersohi 1, Caroline PE Price 1, Sally E Jensen 2, Jennifer L Beaumont 3, Robert C Kern 1,4, David B Conley 1, Kevin C Welch 1, Anju T Peters 4, Leslie C Grammer III 4, Whitney W Stevens 1,4, Alexis M Calice 1, Elizabeth Stanton 1, Marisa K VanderMeeden 1, Robert P Schleimer 1,4, Bruce K Tan 1,4
PMCID: PMC7448958  NIHMSID: NIHMS1601953  PMID: 32376490

Abstract

BACKGROUND

Patient reported outcome (PRO) measures developed and validated on patients with the currently defined phenotypes of chronic rhinosinusitis (CRS) are needed to support clinical trials in CRS.

OBJECTIVE

This study developed and examined the initial reliability and validity of the CRS-PRO, a new patient reported outcome measure of CRS.

METHODS

Instrument development was performed through structured interviews and focus groups with clinical experts and 45 CRS patients meeting current definitions of disease (21 without nasal polyps [CRSsNP] and 24 with nasal polyps [CRSwNP]) to identify items important to patients. Then another 50 patients (32 with CRSsNP and 18 with CRSwNP) with stable CRS symptoms were enrolled to evaluate reliability of the instrument. Each patient completed the CRS-PRO, SNOT-22, and four PROMIS short forms at the baseline visit and then at least 7 days later.

RESULTS

After the development process, 21 items were identified from the conceptual domains of physical symptoms, sensory impairment, psychosocial effects and life impact. Using the responses of the 50 CRS patients, the 21 draft items were further refined to 12 items by eliminating conceptually similar or highly correlated items or those with low mean symptom severity. The 12-item questionnaire was shown to have excellent internal consistency (Cronbach-α 0.86), and test-retest reliability with a high intra-class correlation coefficient (ICC 0.89) and Pearson’s correlation (r=0.82, p<0.0001). The 12-item CRS-PRO correlated highly with the longer SNOT-22 (r=0.83, p<0.0001), demonstrating its concurrent validity. We also demonstrated validity and reliability in a separate analysis for CRSsNP and CRSwNP patients.

CONCLUSION

The CRS-PRO is a concise, valid and reliable measure that was developed with extensive input from CRS patients with current disease definitions.

Keywords: Chronic rhinosinusitis, patient reported outcome measure, reliability, validity, CRS-PRO, nasal polyps

Introduction

Chronic rhinosinusitis (CRS) is an inflammatory disease of the nasal airways and paranasal sinuses. Since 2007, it has been defined by 12 weeks of CRS-appropriate symptoms (nasal obstruction, nasal discharge, facial pain/pressure, and reduced/lost smell) together with objective findings on computed tomography (CT) or endoscopic examination with evidence of inflammation and/or purulence.1, 2 In population-based surveys of European and United States populations, 10.9–11.9% of respondents had CRS-appropriate symptoms, although the percentage of these with CT or endoscopic evidence of inflammation is less well defined.3, 4 CRS significantly impacts patients’ quality of life (QOL) and drives treatment expenditure of $10 billion annually in direct costs and approximately $13 billion or more in indirect costs.5, 6 Clinically, CRS is frequently further categorized into CRS with nasal polyps (CRSwNP) and CRS without nasal polyps (CRSsNP), with evidence from multiple centers demonstrating that the pathophysiology may differ significantly. Despite the high burden of disease, there remain limited options for treating CRS, with topically applied or implanted nasal corticosteroids and a single monoclonal antibody for CRSwNP being the only medications approved to treat this condition.

Patient reported outcome (PRO) instruments are measures of disease severity and symptom burden in chronic diseases. PRO instruments provide the patient’s own interpretation of their health status without modification by a clinician and can serve as endpoints for drug labeling applications to regulatory agencies such as the Food and Drug Administration (FDA).7 In its most recent guidance, the FDA specifies that acceptable PRO instruments need to be supported by both qualitative and quantitative data showing that they were developed and validated with extensive input from patients with the condition. While multiple PRO instruments have been developed to study rhinologic conditions including CRS, a systematic review by Rudmik et al. found that many suffer from inadequate evaluation of psychometric properties or were developed using input from patients who had nasal conditions other than CRS. In their review, the Sinonasal outcome test (SNOT-22) was noted to be the best currently available PRO instrument based on development and psychometric quality analysis.8

However, the SNOT-22 has several shortcomings. First and most pertinent to current FDA guidelines on PRO measures is that the SNOT-22 was not developed using input from patients with current definitions of CRS. The SNOT-22 was a modification of the SNOT-20 instrument, which itself was a derivative of the Rhinosinusitis outcome measure (RSOM-31).9, 10 The focus groups assembled to develop the symptom items in the RSOM-31 and SNOT-20 were convened in 1992–1993 and 1995–1996, respectively. In the description of the study populations, participating patients were diagnosed with rhinosinusitis (duration not specified) using symptoms alone, with many participants described as having rhinitis with or without chronic sinusitis. Given that the first consensus definition of CRS was published in 1997, and that this subsequently underwent major changes to the requisite symptoms and required objective findings on nasal endoscopy or radiographic imaging starting in 2003–2007, it is likely that many participating patients had allergic rhinitis or migraine disorders given the imprecision of symptoms for establishing a CRS diagnosis. Furthermore, in the two published studies on the RSOM/SNOT-20 development, it is uncertain whether any patients with CRSwNP participated in the development of the symptom items. We postulate that their exclusion may have caused major symptoms like nasal blockage and smell loss to be dropped in the SNOT-20 and only reinstated in the SNOT-22 after input from the clinicians who developed this modification. Further subanalyses have found that the SNOT-22 contains items with very low baseline severity (e.g. “dizziness” or “sneezing”), compound symptom items with disparate interpretations (e.g. “nasal blockage/congestion”) and broad emotional items with no disease context (e.g. “sad”).

With these shortcomings in mind, we sought to develop and validate a new PRO instrument using contemporary methodology for PRO development.11 In concordance with current FDA guidance on PROs, we solicited extensive input from patients with current consensus definitions of CRS and its two major clinical phenotypes to better measure the CRS patient experience.1, 2, 7 We hypothesized that this instrument could reliably measure symptom fluctuations in CRS during exacerbations and resolution following treatment. In this study, we discuss the development and initial psychometric validation, including internal consistency, test-retest reliability and performance/validity of our CRS-PRO instrument compared to general health measures, including Patient-Reported Outcomes Measurement Information System (PROMIS) short forms and the EuroQol 5-Dimension (EQ-5D), and to the more CRS-specific SNOT-22.

Methods

Instrument Development

The development of the patient-reported outcome measure was a multidisciplinary process soliciting extensive input from clinicians treating CRS and patients with CRS using methodology previously published.11 The instrument development portion of the CRS-PRO was deemed exempt by the Northwestern Institutional Review Board (IRB). The process began with a comprehensive literature review by the study group on existing legacy instruments for measuring chronic and episodic upper airway disease. Legacy instruments of interest for CRS, upper respiratory infections, and allergic rhinitis (e.g. Rhinosinusitis disability index (RSDI), SNOT, Chronic sinusitis survey (CSS), Wisconsin upper respiratory symptom survey (WURSS)) were examined.12–16 To gather physician input, otolaryngologists, allergists and advanced practice providers from the Northwestern Sinus and Allergy Center of Otolaryngology were interviewed in a focus group to identify critical concepts and potential content gaps of existing legacy instruments utilized in CRS. Primary care physicians treating patients with CRS also participated in structured interviews and focus groups to fully understand the perspectives of clinicians who commonly encounter CRS patients.

Patient-centered focus groups or individual patients with CRS were recruited from the Northwestern Sinus and Allergy Center. Overall 45 patients were recruited for the focus groups and had a current diagnosis of CRS as defined by consensus guidelines at their most recent clinic visit with a requirement that patients had both symptomatic and objective (radiographic and endoscopic) findings of CRS. We also specifically assembled separate groups of patients across the spectrum of disease to ascertain whether the patient experience differed between CRS patients with and without nasal polyps or those with disease requiring sinus surgery. In focus groups led by a health psychologist, patients answered open-ended questions discussing their experience with CRS and identifying its impact on their daily life activities and quality of life. Patients were also asked to rank the top three symptom and impact items and comment on an appropriate recall period for the PRO measure. Patients’ responses were recorded and transcribed verbatim for qualitative analysis. The contents from the patient and clinician focus groups were used to create a symptom index and conceptual framework for measuring the condition. Existing PROs were then evaluated to identify developed items that overlapped with the content identified by the CRS patients. Where no appropriate symptom items could be identified, new items were written based on patient wording identified in the focus group transcripts. In general, items were chosen if they were discussed in three of the four groups of patients and if they were listed among the top three CRS-related impairments by two of the patient groups. The newly drafted CRS-PRO items then underwent extensive evaluation based upon cognitive interviews with patients with CRS to ensure newly written items accurately described symptoms and were uniformly interpreted by the patients. 
The remaining items then underwent a translatability review to minimize culture-specific language; for example, “I had a runny nose” was better represented by “I had mucus in my nose” since translations of “runny nose” were similar to having a cold in some languages. Similarly, items that were linguistically similar to each other were removed to reduce redundancy.

Validation of the CRS PRO

Approval for the validation of the CRS-PRO was obtained from the Northwestern University Institutional Review Board and informed consent was obtained from each enrolled subject. We prospectively enrolled a second, separate set of 50 patients between November 2016 and April 2017 from the Northwestern Medical Group Department of Otolaryngology Sinus Center clinic, a tertiary care referral center. All patients were at least 18 years old and had CRS as defined by the EPOS and International consensus statement (ICAR-RS) criteria, including documented radiographic and/or endoscopic evidence of CRS. We specifically sought patients with stable CRS symptoms for whom no new pharmacologic or surgical interventions were planned between assessments.1 Exclusion criteria included limited English comprehension and patients who initiated or were planning on initiating medical therapy (excluding topical therapy) in the two weeks prior to enrollment. Many of these patients had failed medical therapy and were enrolled in the study in the period while awaiting sinus surgery. We collected baseline patient information including age, gender, CRS subtype (with or without polyps), and history of asthma, allergic rhinitis, aspirin sensitivity, smoking, and endoscopic sinus surgery. Additionally, the most recently available CT Lund-Mackay scores and endoscopic scores (i.e. from the same visit or a previous visit) were recorded to ensure patients had objective evidence of CRS.17

The 21-item draft CRS-PRO instrument was administered electronically to the 50 patients with stable CRS symptoms on “day 0” along with the SNOT-22 and general QoL measures from four PROMIS short forms (Satisfaction with Participation in Social Activities 7a; Fatigue 6a; Pain-Intensity 6a; and Sleep Disturbance 6a) and lastly the EQ-5D-5L with a visual analog scale (VAS).16, 18, 19 PROMIS (Patient reported outcomes measurement information system) is a set of patient centered measures rigorously developed by the NIH to evaluate physical, mental and social health in chronic conditions. Each PROMIS short form has a calibrated scoring scheme that generates a T-score where 50 represents the census-representative general population mean. We used the T-scores for our analysis. For the EQ-5D-5L, we used the VAS score for further analysis. The VAS score is a method prescribed by the EQ-5D scoring guide for presenting health status measured on this instrument.20 Study data were collected and managed using REDCap electronic data capture tools hosted at Northwestern University Feinberg School of Medicine.21 REDCap (Research Electronic Data Capture) is a secure, web-based application designed to support data capture for research studies.

On “day 7”, these same instruments excluding the EQ-5D were administered via e-mail. Any missing data points (0.3% of total data points from all questionnaires) were imputed based on the average for that measure. Alternatively, if a thematically related item in another PRO instrument had been completed (e.g. the “nose blocked” item in the SNOT-22 and the “difficulty breathing through nose” item in the CRS-PRO instrument), the response on the completed item was used in place of the missing value.
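The two imputation rules described above can be sketched in a few lines. This is a minimal illustration, not the study's procedure: it assumes "the average for that measure" means the mean of the observed responses to that item, it assumes the related item is on a comparable response scale, and the function names and data are hypothetical.

```python
from statistics import mean


def impute_item_mean(responses):
    """Replace None entries with the mean of the observed responses (assumed rule)."""
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("cannot impute an item with no observed responses")
    fill = mean(observed)
    return [fill if r is None else r for r in responses]


def impute_from_related(primary, related):
    """Fill a missing primary response from a thematically related, completed item."""
    return [rel if (p is None and rel is not None) else p
            for p, rel in zip(primary, related)]


# Hypothetical responses from five patients; None marks a skipped item.
crs_pro_breathing = [3, None, 2, 4, 1]
related_blocked = [3, 2, 2, 4, 1]

filled_by_mean = impute_item_mean(crs_pro_breathing)
filled_by_related = impute_from_related(crs_pro_breathing, related_blocked)
```

With so few missing points (0.3%), either rule should have little effect on the summary statistics.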

Statistical analysis

All statistical analyses were performed using Stata 14.1 (StataCorp, College Station, TX). Demographic and clinical characteristics were reported as frequencies and percentages for categorical variables, and means, standard deviations and range for continuous and symptom variables. Cronbach’s coefficient alpha for the instrument and the instrument without each item was calculated to evaluate internal consistency reliability. A Cronbach’s alpha >0.70 is considered to be adequate.22 Test-retest reliability was estimated by calculating the intra-class correlation coefficient (ICC) and Pearson’s correlation coefficient between baseline and follow-up instrument scores. An ICC greater than 0.75 is considered good, and greater than 0.90 is considered excellent.23 Significance was determined at p<0.05. We calculated that a sample size of 50 allowed us, with 80% assurance, to estimate the ICC with a 95% confidence interval width of 0.2, assuming the true test-retest reliability is at least 0.80. Concurrent validity and the performance of the instrument were further assessed by comparison of the CRS-PRO instrument to the SNOT-22 and PROMIS instruments using Pearson correlation coefficients. Pearson correlation coefficients of >0.7 were considered strong, 0.5–0.7 were considered moderate and <0.5 were considered weak.
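As a rough illustration of the correlation thresholds stated above, the sketch below computes a Pearson coefficient and labels its strength. The study used Stata; this Python version is only an illustrative equivalent. Applying the thresholds to the absolute value of r is an assumption made here so that negative correlations (such as those with the EQ-5D VAS) can be labeled as well.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_strength(r):
    """Label r using the stated cut-offs; using |r| is an assumption of this sketch."""
    magnitude = abs(r)
    if magnitude > 0.7:
        return "strong"
    if magnitude >= 0.5:
        return "moderate"
    return "weak"


# Coefficients reported in this study, labeled by the rule above.
labels = {r: classify_strength(r) for r in (0.83, -0.57, 0.39)}
```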

Results

Instrument development

Focus groups involved 45 patients with CRS. To assess whether CRS subtype and/or surgical intervention affected patients’ responses, four separate groups of patients were interviewed: CRSsNP (n=7) and CRSwNP (n=6) with no prior surgery, and CRSsNP (n=14) and CRSwNP (n=18) with prior surgery. There were no differences in sex, age, or race demographics between the groups (Table 1). Qualitative analyses of the structured interview and focus group transcripts were carried out until content saturation was achieved. The patient quotes obtained from the focus groups were then used to develop a conceptual framework for identified CRS impact using items rated as most important by patients and clinicians. We found that the transcripts from the CRSsNP and CRSwNP focus group interviews, as well as the surgically and non-surgically managed patients, had similar symptom and impact items that were identified, although slight differences were noted in the symptoms rated as most important by patients with these subtypes. The conceptual framework identified impairment in 4 domains comprised of 9 “physical symptom” concepts, 3 “sensory impairment” concepts, 7 “psychosocial effects” concepts, and 5 “life impact” concepts using the transcript contents (Figure 1). Using the concepts derived from the transcripts, a total of 65 existing items from high quality validated PROs were identified as relevant to CRS patients. An additional 27 new items were written where gaps in existing content were identified. After cognitive interviews with CRS patients to determine consistency of interpretation, the 92 items were reduced, using patient preference, to 21 items derived from the 4 domains: physical symptom, sensory impairment, psychosocial effects and life impact.
Using the cognitive interview transcripts, we also chose not to test items to represent the concepts of “headache”, “auditory/hearing”, “financial impact”, “physical function impact” or “workplace impact” since some involved patients strongly felt they were not affected by these concepts. Of the 21 items, 10 items had previously been developed for use in other PROs developed using similar methodology and 11 items had been newly written. Of the 11 newly written items, 8 items were thematic dyads of 4 physical symptom concepts that cognitive interviews suggested patients felt were important. Since patients were equivocal about which item better represented their symptoms, two items per concept were tested to evaluate their validity. For a recall period, qualitative analysis of patient transcripts suggested involved patients were split between the choice of a 7-day and a 1-month recall period. Recall periods shorter or longer than these intervals were deemed inappropriate by patients. We decided to utilize a recall period of 7 days as it was consistent with the recall period utilized in the PROMIS short forms and may better capture the rapid changes associated with treatment or exacerbations. All items were then tested during cognitive interviews using the 7-day recall.

Table 1.

Instrument development participant characteristics. Subjects with these characteristics shared their experience living with CRS in focus groups or individually using a structured interview script

Characteristic | CRSsNP, Non-Surgical, N (%) | CRSwNP, Non-Surgical, N (%) | CRSsNP, Surgical, N (%) | CRSwNP, Surgical, N (%) | Total, N (%)
N | 7 (16%) | 6 (13%) | 14 (31%) | 18 (40%) | 45
Age, Mean (SD) | 42 | 44 | 47 | 49 | 45
Gender
 Female | 4 (57%) | 3 (50%) | 12 (86%) | 8 (44%) | 27 (60%)
 Male | 3 (43%) | 3 (50%) | 2 (14%) | 10 (56%) | 18 (40%)
Ethnicity
 Non-Hispanic | 5 (71%) | 6 (100%) | 12 (86%) | 17 (94%) | 40 (89%)
 Hispanic | 0 | 0 | 0 | 1 (6%) | 1 (2%)
 Unknown | 2 (29%) | 0 | 2 (14%) | 0 | 4 (9%)
Race
 Asian | 0 | 1 (17%) | 0 | 0 | 1 (2%)
 African American | 1 (14%) | 0 | 0 | 3 (17%) | 4 (9%)
 White | 6 (86%) | 5 (83%) | 13 (93%) | 13 (72%) | 37 (82%)
 Other | 0 | 0 | 0 | 0 | 0
 Unknown | 0 | 0 | 1 (7%) | 2 (11%) | 3 (7%)

Figure 1:


Conceptual model of health related impairment in CRS developed from patient focus groups. Individual items from CRS-PRO and the sinonasal outcome test-22 (SNOT-22) are mapped under the appropriate concepts to illustrate item coverage in each instrument. By comparison, the SNOT-22 disproportionately covers certain concepts (i.e. sleep quality) and covers concepts that were found to have low baseline severity which were excluded from the CRS-PRO.

Instrument validation

A total of 50 patients were enrolled in the reliability portion of the study and completed administrations of the CRS-PRO, SNOT-22, EQ-5D-5L and PROMIS short forms. Average patient age was 45.6 years (range 21 to 74). Twenty-seven (54%) patients were male and 23 (46%) were female. CT scores were available in 38 patients with an average Lund-Mackay score of 8.7 (SD 4.1). Endoscopic findings were reported in all 50 patients, with 43 patients (86%) noted to have endoscopic evidence of edema, drainage or polyps. All patients had CT or endoscopic objective evidence of CRS per consensus guidelines. Eighteen of the patients (36%) had CRS with polyps (CRSwNP), while 32 (64%) had CRS without polyps (CRSsNP). The baseline demographics of all patients and of each CRS subtype are summarized in Table 2. Consistent with known disease demographics, patients with CRSwNP were older and more likely to have undergone previous ESS and to have comorbid asthma.

Table 2:

Baseline demographics of instrument validation patients

Characteristic | All Patients | CRSsNP | CRSwNP
N (%) | 50 (100%) | 32 (64%) | 18 (36%)
Age (mean±SD (range)) | 45.6±15.7 (21-74) | 41.1±14.8 (21-71)* | 53.7±14.5 (27-74)
Race
 White | 42 (84%) | 26 (81%) | 16 (89%)
 African American | 5 (10%) | 4 (12%) | 1 (6%)
 Asian | 2 (4%) | 1 (3%) | 1 (6%)
 Unknown | 1 (2%) | 1 (3%) | 0 (0%)
Gender
 Male | 27 (54%) | 17 (53%) | 10 (56%)
 Female | 23 (46%) | 15 (47%) | 8 (44%)
Hx of ESS | 18 (36%) | 7 (22%)* | 11 (61%)
Asthma | 14 (28%) | 5 (16%)* | 9 (50%)
AERD | 2 (4%) | 0 (0%) | 2 (11%)
Allergies | 23 (46%) | 15 (47%) | 8 (44%)
Smoking status
 Never smoker | 36 (72%) | 22 (69%) | 14 (78%)
 Former smoker | 10 (20%) | 7 (22%) | 3 (17%)
 Current smoker | 4 (8%) | 3 (9%) | 1 (6%)

* p<0.05

CRSsNP: CRS without polyps; CRSwNP: CRS with polyps; ESS: Endoscopic sinus surgery; AERD: Aspirin exacerbated respiratory disease

Acceptability and item reduction for the CRS-PRO instrument

The 21-item CRS-PRO instrument (Table 3) was well received by patients, with only nine (0.4%) missing data points in the entire group across both administrations of the 21-item CRS-PRO instrument. Comparatively, the SNOT-22 was also well received by patients, with eight (0.3%) missing data points. The baseline mean of each item in the CRS-PRO was calculated in order to evaluate the average frequency among CRS patients (Table 3). First, we removed the two impact items (Item 20: “Because of my illness, some people avoided me” and Item 21: “Because of my illness, I felt embarrassed in social situations”) as they had low mean scores of 0.36 and 1.1, respectively. Items 9 (“I had mucus in my nose”) and 10 (“I had mucus dripping from my nose”) were dyads of the concept of mucus in the nose. In addition, items 13 (“My symptoms kept me awake at night”) and 15 (“My sleep was refreshing”) were dyads of the concept of sleep impairment. We chose items 9 and 13 in each dyad as the mean scores were higher for these items, indicating they had more patient impact. Likewise, for items 1 (“I had difficulty breathing through my nose”) and 2 (“My nose was blocked”), and items 7 (“I had trouble clearing mucus from my throat”) and 8 (“I had mucus in my throat”), items 1 and 8 were chosen for higher baseline symptom severity. A correlation matrix (Pearson’s correlation) of each of the 21 initial items was also created. We noted that the physical domain symptom dyads had high inter-item correlations >0.75, further justifying their removal. In addition, the psychosocial items 16–19 similarly had high correlations (Pearson’s r>0.75); items 16 (“I felt anxious about the uncertainty of my chronic rhinosinusitis”) and 18 (“I felt overwhelmed by my condition”) were removed to reduce redundancy despite being derived from separate concepts.
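The dyad screening described above (flag item pairs with inter-item correlation above 0.75, then keep the item with the higher baseline mean) can be sketched as follows. This is only an illustration of the rule under simplifying assumptions: the actual item selection also relied on conceptual judgment, and the item names and responses below are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(items, threshold=0.75):
    """Return (keep, drop, r) for each item pair whose correlation exceeds threshold.

    The item with the higher mean score is kept, mirroring the rule in the text.
    """
    flagged = []
    for a, b in combinations(items, 2):
        r = pearson_r(items[a], items[b])
        if r > threshold:
            keep, drop = (a, b) if mean(items[a]) >= mean(items[b]) else (b, a)
            flagged.append((keep, drop, round(r, 2)))
    return flagged


# Hypothetical 0-4 responses from six patients (not study data).
items = {
    "mucus_in_nose": [3, 4, 2, 3, 4, 1],
    "mucus_dripping": [2, 3, 1, 2, 3, 0],
    "smell_problems": [0, 4, 1, 3, 0, 2],
}
flagged = redundant_pairs(items)
```

In this toy data only the two mucus items are flagged, and the one with the higher mean is retained.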

Table 3:

Mean (±SD) symptom burden for each item on the 21-item draft CRS-PRO at baseline. Symptoms were scored 0–4, with 4 representing the most severe (note that Item 12 and 15 are reverse scored). The italicized items denoted by XXX were removed in the abbreviated 12-item CRS-PRO after consideration of their individual psychometric properties.

Item | Baseline (mean±SD)
Physical Symptoms
1) I had difficulty breathing through my nose (a) | 2.4±1.4
XXX 2) My nose was blocked (a) | 2.3±1.3
3) I felt pressure in my face (b) | 1.7±1.4
4) My face hurt (b) | 1.3±1.4
5) I had to blow my nose | 2.6±1.3
6) I have been coughing | 1.7±1.4
XXX 7) I had trouble clearing mucus from my throat (c) | 1.8±1.4
8) I had mucus in my throat (c) | 2±1.3
9) I had mucus in my nose (d) | 2.9±0.98
XXX 10) I had mucus dripping from my nose (d) | 1.9±1.3
Sensory Impairment
11) I had problems with my sense of smell | 2.0±1.3
XXX 12) I was able to enjoy the taste of food. | 1.6±1.3
Psychosocial Effects
13) My symptoms kept me awake at night. | 1.5±1.3
14) I felt fatigued | 1.7±1.4
XXX 15) My sleep was refreshing | 2.5±1.2
XXX 16) I felt anxious about the uncertainty of my chronic rhinosinusitis | 2±1.3
17) I worried that my condition will get worse. | 2.1±1.4
XXX 18) I felt overwhelmed by my condition. | 1.5±1.5
19) I was frustrated by my condition. | 2.3±1.5
Impact
XXX 20) Because of my illness, some people avoided me. | 0.36±0.69
XXX 21) Because of my illness, I felt embarrassed in social situations | 1.1±1.2

(a), (b), (c), (d) denote dyads of rhinologic symptoms assessing a similar construct.

Internal Consistency Reliability

We next assessed the internal consistency of our measure with Cronbach’s coefficient alpha (Table E1). Item 12 (“I was able to enjoy the taste of food”), although a pertinent question to CRS patients, had a negative item-rest correlation and was detrimental to the internal consistency of the instrument. Based on this analysis we removed item 12 from the final instrument. We verified that the removal of the previously mentioned redundant or low severity items was not significantly detrimental to the instrument’s overall internal consistency. This resulted in an overall 12-item CRS-PRO instrument with a Cronbach’s alpha of 0.86, with item-rest correlations ranging from 0.15 to 0.71. By comparison, if all 21 CRS-PRO items were utilized, the Cronbach’s alpha of the resulting instrument would be 0.90, with item-rest correlations ranging from 0.05 to 0.76. Although the smell item had the lowest correlation (r=0.15) to the rest of the instrument, it was retained given the importance of smell as a cardinal symptom of CRS. The Cronbach’s alpha for the longer SNOT-22 was 0.93, suggesting that both instruments met the threshold for robust internal consistency.
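For readers unfamiliar with these statistics, the sketch below shows the standard formulas: Cronbach's alpha is k/(k-1) multiplied by (1 minus the sum of item variances divided by the variance of the total score), and an item-rest correlation is the Pearson correlation between one item and the sum of the remaining items. The study's values were computed in Stata; this Python version and its respondent data are hypothetical and purely illustrative.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


def item_rest_correlation(rows, index):
    """Pearson correlation of item `index` with the sum of all other items."""
    item = [row[index] for row in rows]
    rest = [sum(row) - row[index] for row in rows]
    mi, mr = mean(item), mean(rest)
    sxy = sum((a - mi) * (b - mr) for a, b in zip(item, rest))
    sxx = sum((a - mi) ** 2 for a in item)
    syy = sum((b - mr) ** 2 for b in rest)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 0-4 responses: five respondents, three items (not study data).
rows = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [3, 3, 4], [4, 4, 4]]
alpha = cronbach_alpha(rows)
item_rest = [item_rest_correlation(rows, i) for i in range(3)]
```

An item with a negative item-rest correlation, as reported for item 12, lowers alpha, which is why dropping it can improve internal consistency.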

Instrument scoring

After identifying the highly correlated and low prevalence items, we tested two methods of scoring the CRS-PRO instrument: a total score summing either the 12 or the 21 individual item responses (Figure 2).

Figure 2:


12-item CRS-PRO instrument with a 5-point Likert scale scoring system. Symptoms were scored 0–4, with 4 representing the most severe.
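A minimal sketch of sum scoring, assuming raw responses are coded 0–4 and that reverse-scored items are flipped as 4 minus the response before summing. The function name is hypothetical. Under these assumptions the 12-item total ranges from 0 to 48, and the 21-item draft (with items 12 and 15 reverse scored) from 0 to 84.

```python
MAX_RESPONSE = 4  # responses are scored 0-4, with 4 the most severe


def total_score(responses, reverse_scored=()):
    """Sum item responses; 1-based positions in reverse_scored are flipped first."""
    total = 0
    for position, value in enumerate(responses, start=1):
        if not 0 <= value <= MAX_RESPONSE:
            raise ValueError(f"item {position}: response {value} outside 0-4")
        total += MAX_RESPONSE - value if position in reverse_scored else value
    return total


# Hypothetical raw responses, for illustration only.
score_12 = total_score([2, 1, 0, 3, 1, 2, 3, 2, 1, 1, 2, 3])
score_21 = total_score([2] * 21, reverse_scored={12, 15})
```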

Test-Retest Reliability

The test-retest reliability was assessed using responses from the initial and follow-up visits. Follow-up time between initial and subsequent visits was 8.0 days (range 7–29 days). The Pearson’s correlation between the 21-item CRS-PRO scores from initial and follow-up visits was 0.84 (p<0.01) and for the 12-item CRS-PRO was 0.82 (p<0.01). By comparison, the correlation for the longer SNOT-22 was 0.83 (p<0.01). The intraclass correlation coefficient was calculated using an absolute agreement two-way mixed effects model: the average ICC was 0.90 (CI 0.81 to 0.95, F=11.2, p<0.0005) for the 21-item CRS-PRO and 0.89 (CI 0.80 to 0.94, F=10.01, p<0.0005) for the 12-item CRS-PRO. For comparison, the same ICC absolute agreement two-way mixed effects model was calculated for the SNOT-22 scores between the two time points, yielding an average ICC of 0.91 (CI 0.84 to 0.95, F=10.57, p<0.0005).
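To make the ICC concrete, the sketch below computes an average-measures, absolute-agreement ICC from two-way ANOVA mean squares: ICC(A,k) = (MSR − MSE) / (MSR + (MSC − MSE)/n), where MSR, MSC and MSE are the mean squares for patients, time points and error, and n is the number of patients. It assumes this is the form meant by "average ICC" in the text; the reported values came from Stata, and the scores below are hypothetical.

```python
def icc_absolute_average(scores):
    """Average-measures, absolute-agreement ICC from a two-way ANOVA decomposition.

    scores holds one tuple per patient, e.g. (baseline_total, follow_up_total).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical total scores at baseline and follow-up for four patients.
retest = [(10, 12), (20, 19), (30, 33), (25, 24)]
icc = icc_absolute_average(retest)
```

Unlike Pearson's correlation, the absolute-agreement form penalizes a systematic shift between the two administrations, which is why both statistics are reported.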

Concurrent Validity

We assessed the concurrent validity of the CRS-PRO instrument by comparing it to the validated CRS-specific SNOT-22 instrument. We noted a high Pearson’s correlation of 0.83 (p<0.00005) between the SNOT-22 and the 12-item CRS-PRO instrument (Table 4). Compared with the general PRO measures, the EQ-5D-5L visual analog scale score correlated more strongly with the 12-item CRS-PRO instrument (Pearson correlation −0.57, p<0.01) and the 21-item CRS-PRO (Pearson correlation −0.54, p<0.01) than with the SNOT-22 (Pearson correlation −0.44, p<0.01). The PROMIS fatigue short form had moderate correlation with the overall instruments (Table 4).

Table 4:

Instrument and subdomain correlations to general patient reported outcome measures

Instrument | SNOT-22 | EQ-5D | PROMIS fatigue (a) | PROMIS sleep (b) | PROMIS pain (c) | PROMIS satisfaction (d)
CRS-PRO 21-item | 0.81** | −0.54** | 0.65** | 0.39** | 0.61** | −0.52**
CRS-PRO 12-item | 0.83** | −0.57** (−0.77 to −0.38) | 0.65** | 0.39** | 0.60** | −0.48**

* p<0.05

** p<0.01

Patient Reported Outcome Measurement Information System (PROMIS) short forms: (a) Fatigue 6a, (b) Sleep Disturbance 6a, (c) Pain-Intensity 6a and (d) Satisfaction with Participation in Social Activities 7a were scored and converted into respective T-scores prior to correlation.

Reliability of the CRS-PRO instrument among patients with CRSsNP and CRSwNP

Both instruments demonstrated excellent internal consistency and reliability in both CRS subtypes. Table 5 shows excellent Cronbach’s alpha coefficients, Pearson’s correlation coefficients and ICC scores using an absolute agreement two-way mixed effects model for patients with CRSwNP and for those with CRSsNP.

Table 5:

Reliability assessment of the CRS-PRO and SNOT-22 in both CRS with and without polyps subgroups.

Measure | CRSsNP | CRSwNP
Cronbach’s alpha
 CRS-PRO 21-item | 0.91 | 0.88
 CRS-PRO 12-item | 0.87 | 0.82
 SNOT-22 | 0.93 | 0.94
Test-Retest Correlation (Pearson’s Correlation)
 CRS-PRO 21-item | 0.83** | 0.82**
 CRS-PRO 12-item | 0.80** | 0.83**
 SNOT-22 | 0.75** | 0.91**
ICC
 CRS-PRO 21-item | 0.90 (F=10.5, p<0.0005) | 0.88 (F=10.54, p<0.0005)
 CRS-PRO 12-item | 0.88 (F=8.94, p<0.0005) | 0.88 (F=9.8, p<0.0005)
 SNOT-22 | 0.85 (F=6.75, p<0.0005) | 0.96 (F=21.3, p<0.0005)

* p<0.05

** p<0.01

Cronbach’s coefficient alpha evaluates internal consistency reliability. A Cronbach’s alpha >0.70 is considered to be adequate.

ICC: intra-class correlation coefficient. An ICC greater than 0.75 is considered good, and greater than 0.90 is considered excellent.

Discussion

In this study, we describe the development and initial assessments of reliability and validity of a new CRS-specific patient reported outcome measure, the CRS-PRO. We had hypothesized that it could reliably measure the symptoms and psychosocial impact of CRS. Per FDA guidance on the development of patient reported outcomes, the development process extensively solicited and documented input from patients who met current consensus diagnostic guidelines of CRS, including its two major clinical phenotypes, CRSwNP and CRSsNP.1, 2, 7 This ensured the patient experience was accurately translated. We tested a draft instrument comprised of 21 items identified as important by patients in focus groups but refined this to 12 items based on evaluation of their measurement properties in a CRS patient cohort (CRSsNP and CRSwNP). While PRO measures with more items will statistically be more reliable, the CRS-PRO is almost half the length of the SNOT-22 and demonstrated comparable metrics of reliability (internal consistency and test-retest reliability) and concurrent validity. With its shorter length, the CRS-PRO will minimize the cognitive strain of completing PRO measures and may permit more frequent administration.11 Furthermore, the CRS-PRO demonstrated a stronger correlation than the SNOT-22 with a general health measure, the EQ-5D visual analog scale score.

With a greater focus on patient-centered care, there is a need for patient reported outcome measures (PROMs) that accurately inform clinicians and investigators of health care delivery. General health related QoL measures (e.g. the EQ-5D, PROMIS) allow for the comparison of disease burden between chronic diseases. Disease-specific instruments can capture specific CRS parameters that are of clinical significance and play an important role in clinical trials, providing the patient perspective. The FDA guidelines for PROMs used as endpoints in clinical trials state that PROMs used for labeling claims must demonstrate development and modification based on patient input, be developed using patients with the disease being studied, have a recall period specific and appropriate to the disease, and have demonstrated validity, reliability and responsiveness data.7 To our knowledge, there are no PROMs in CRS that meet all these FDA criteria. Although the CRS-PRO will require acceptance by the FDA as having met these criteria, it was developed with the prescribed methodology provided by their guidance statement.7

CRS-specific PROMs vary significantly in the degree of patient involvement in their development. The SNOT-22, currently the highest quality and most widely used CRS-specific instrument, was first developed for use in a large multicenter study of sino-nasal surgery by the Royal College of Surgeons of England.24 The SNOT-22 was derived from the SNOT-20 instrument, which itself traces its lineage to the RSOM-31. Although the RSOM-31 did use structured patient interviews in its development, the patients involved likely would not meet current definitions of disease: very few (2–3%) had CRSwNP, and a substantial percentage had rhinitis rather than CRS. Furthermore, subsequent modifications in symptom wording and item elimination (e.g. the SNOT-20 or SNOT-22) were physician-driven with little documented CRS patient input.10 As illustrated in Figure 1, although the SNOT-22 largely maps to the conceptual model we developed for the CRS-PRO, it weights certain concepts disproportionately, with several items representing a single concept (e.g. 3 items to represent sleep quality). The SNOT-22 also retains a number of concepts that were not reported by patients during development or were dropped from the CRS-PRO due to low reported severity. In fact, in further analysis of our cohort, SNOT-22 items such as “dizziness,” “ear pain,” “sad,” and “embarrassed” had baseline severity scores less than 1 on the Likert scale (Table E2). We believe the careful choice of individual items to represent each concept in the CRS-PRO results in a shorter instrument that maintains metrics of reliability while covering the spectrum of impairment experienced by patients with CRS. This conceptual model is also an important framework recommended in the FDA guidance document.7 We hypothesize that the CRS-PRO may consequently contain fewer symptom subdomains than the SNOT-22.25

In our development process, we also specifically sought to evaluate whether patients with CRSsNP and CRSwNP experienced symptoms differently. Patients with CRSwNP are disproportionately affected by certain symptoms (e.g. smell loss, lower airway symptoms), and the development of the SNOT-22 included very limited or undocumented participation from this important subset of patients.1 During the development of the instrument, qualitative analysis of the focus group transcripts found that the CRSsNP and CRSwNP groups shared a similar conceptual framework and similar symptoms, although there were subtle differences in which symptoms were most important. In this cohort, patients with CRSsNP had an overall higher disease burden as measured by both the CRS-PRO and SNOT-22 (data not shown). In addition, as would be expected, CRSwNP patients had higher smell loss scores, while those with CRSsNP had higher physical/psychological impact from their disease.

A limitation of this study is the possibility that expert interpretation bias was introduced into the development process. However, every effort was made in the design and validation of the CRS-PRO to include patient input at every step and to use statistical methods based on the patient data to guide expert decisions. In addition, although the CRS-PRO was developed using conceptual subdomains, we were unable to perform a factor analysis to validate the subdomains, as a factor analysis typically requires a sample size of several hundred before becoming stable. Therefore, only the total score of the 12-item instrument is recommended for interpretation.

This manuscript reports the first part of the validation process for the CRS-PRO instrument. The second study will examine the convergent validity of the instrument compared to objective findings and the responsiveness of the instrument to clinical changes in patients undergoing appropriate medical therapy.

Conclusions

This study demonstrates the preliminary validity and reliability of the CRS-PRO instrument, which was developed with significant patient input in both development and evaluation. We hope that the CRS-PRO, pending further evaluation, will provide a sensitive tool to assess clinical changes in CRS patients, enhance patient care, and become acceptable for use in FDA clinical trials.

Supplementary Material

supplementary files

Highlights box.

1. What is already known about this topic?

PRO measures quantify health related impairment and can serve as endpoints for clinical trials in CRS. Few PRO instruments for CRS were developed with documented input from patients meeting current CRS definitions, and existing instruments are not accepted by the FDA as endpoints for clinical trials.

2. What does this article add to our knowledge?

The 12-item CRS-PRO is a new disease-specific PRO measure of CRS. It was developed using extensive input from patients meeting current definitions of CRS, is more concise than current PRO measures (e.g. SNOT-22), and possesses comparable reliability and validity.

3. How does this study impact current management guidelines?

The validated CRS-PRO was developed in concordance with current FDA guidelines on PRO measures. This study, along with developmental documentation, can be used to support acceptance of the CRS-PRO as an endpoint for clinical trials.

Acknowledgments

REDCap is supported at FSM by the Northwestern University Clinical and Translational Science (NUCATS) Institute. Research reported in this publication was supported, in part, by the National Institutes of Health’s National Center for Advancing Translational Sciences, Grant Number UL1TR001422. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

SUPPORTED BY: The National Institutes of Health Grants: U01 AI106683-Chronic Rhinosinusitis Integrative Studies Program (CRISP), K23DC012067, R01DC016645 and the Ernest S. Bazley Foundation

Abbreviations

BSIT: Brief Smell Identification Test
CI: Confidence interval
CRS: Chronic rhinosinusitis
CRSsNP: Chronic rhinosinusitis without nasal polyps
CRSwNP: Chronic rhinosinusitis with nasal polyps
CSS: Chronic sinusitis survey
CT: Computed tomography
EPOS: European position paper on rhinosinusitis and nasal polyps
EQ-5D: EuroQol 5-dimension questionnaire
FDA: Food and Drug Administration
ICAS-RS: International consensus statement on allergy and rhinology: rhinosinusitis
ICC: Intraclass correlation coefficient
IRB: Institutional Review Board
HRQOL: Health related quality of life
NIH: National Institutes of Health
NPIF: Nasal peak inspiratory flow
PROM: Patient reported outcome measure
PROMIS: Patient reported outcome measurement information system
QoL: Quality of life
SD: Standard deviation
SNOT-22: Sinonasal outcome test - 22
RSDI: Rhinosinusitis disability index
RSOM-31: Rhinosinusitis outcome measure
VAS: Visual analog scale
WURSS: Wisconsin upper respiratory symptom survey

Footnotes

Disclosures: None

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  • 1. Fokkens WJ, Lund VJ, Mullol J, Bachert C, Alobid I, Baroody F, et al. EPOS 2012: European position paper on rhinosinusitis and nasal polyps 2012. A summary for otorhinolaryngologists. Rhinology 2012; 50:1–12.
  • 2. Orlandi RR, Kingdom TT, Hwang PH, Smith TL, Alt JA, Baroody FM, et al. International Consensus Statement on Allergy and Rhinology: Rhinosinusitis. Int Forum Allergy Rhinol 2016; 6 Suppl 1:S22–209.
  • 3. Hirsch AG, Stewart WF, Sundaresan AS, Young AJ, Kennedy TL, Scott Greene J, et al. Nasal and sinus symptoms and chronic rhinosinusitis in a population-based sample. Allergy 2017; 72:274–81.
  • 4. Tomassen P, Newson RB, Hoffmans R, Lotvall J, Cardell LO, Gunnbjornsdottir M, et al. Reliability of EP3OS symptom criteria and nasal endoscopy in the assessment of chronic rhinosinusitis--a GA(2)LEN study. Allergy 2011; 66:556–61.
  • 5. Rudmik L. Economics of Chronic Rhinosinusitis. Curr Allergy Asthma Rep 2017; 17:20.
  • 6. Smith KA, Orlandi RR, Rudmik L. Cost of adult chronic rhinosinusitis: A systematic review. Laryngoscope 2015; 125:1547–56.
  • 7. U.S. Department of Health and Human Services FDA Center for Drug Evaluation and Research, U.S. Department of Health and Human Services FDA Center for Biologics Evaluation and Research, et al. Guidance for industry: patient-reported outcome measures: use in medical product development to support labeling claims. Health Qual Life Outcomes 2009; 4:79.
  • 8. Rudmik L, Hopkins C, Peters A, Smith TL, Schlosser RJ, Soler ZM. Patient-reported outcome measures for adult chronic rhinosinusitis: A systematic review and quality assessment. J Allergy Clin Immunol 2015; 136:1532–40 e1–2.
  • 9. Piccirillo JF, Edwards D, Haiduk A, Yonan C, Thawley SE. Psychometric and Clinimetric Validity of the 31-Item Rhinosinusitis Outcome Measure (RSOM-31). American Journal of Rhinology 1995; 9:297–306.
  • 10. Piccirillo JF, Merritt MG Jr., Richards ML. Psychometric and clinimetric validity of the 20-Item Sino-Nasal Outcome Test (SNOT-20). Otolaryngol Head Neck Surg 2002; 126:41–7.
  • 11. Cella D, Hahn EA, Jensen SE, Butt Z, Nowinski CJ, Rothrock N, et al. In: Patient-Reported Outcomes in Performance Measurement. Research Triangle Park (NC); 2015.
  • 12. Banglawala SM, Schlosser RJ, Morella K, Chandra R, Khetani J, Poetker DM, et al. Qualitative development of the sinus control test: a survey evaluating sinus symptom control. Int Forum Allergy Rhinol 2016; 6:491–9.
  • 13. Barrett B, Brown R, Mundt M, Safdar N, Dye L, Maberry R, et al. The Wisconsin Upper Respiratory Symptom Survey is responsive, reliable, and valid. J Clin Epidemiol 2005; 58:609–17.
  • 14. Benninger MS, Senior BA. The development of the Rhinosinusitis Disability Index. Arch Otolaryngol Head Neck Surg 1997; 123:1175–9.
  • 15. Gliklich RE, Metson R. Techniques for outcomes research in chronic sinusitis. Laryngoscope 2015; 125:2238–41.
  • 16. Hopkins C, Gillett S, Slack R, Lund VJ, Browne JP. Psychometric validity of the 22-item Sinonasal Outcome Test. Clin Otolaryngol 2009; 34:447–54.
  • 17. Lund VJ, Mackay IS. Staging in rhinosinusitus. Rhinology 1993; 31:183–4.
  • 18. Cook KF, Jensen SE, Schalet BD, Beaumont JL, Amtmann D, Czajkowski S, et al. PROMIS measures of pain, fatigue, negative affect, physical function, and social function demonstrated clinical validity across a range of chronic conditions. J Clin Epidemiol 2016; 73:89–102.
  • 19. Remenschneider AK, D’Amico L, Gray ST, Holbrook EH, Gliklich RE, Metson R. The EQ-5D: a new tool for studying clinical outcomes in chronic rhinosinusitis. Laryngoscope 2015; 125:7–15.
  • 20. EQ-5D-5L User Guide. 2019. Available from https://euroqol.org/publications/user-guides/.
  • 21. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform 2009; 42:377–81.
  • 22. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007; 60:34–42.
  • 23. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 2016; 15:155–63.
  • 24. Hopkins C, Browne JP, Slack R, Lund V, Topham J, Reeves B, et al. The national comparative audit of surgery for nasal polyposis and chronic rhinosinusitis. Clin Otolaryngol 2006; 31:390–8.
  • 25. Browne JP, Hopkins C, Slack R, Cano SJ. The Sino-Nasal Outcome Test (SNOT): can we make it more clinically meaningful? Otolaryngol Head Neck Surg 2007; 136:736–41.
