Abstract
Background/Aims
Gastroesophageal reflux disease is a highly prevalent disease. Assessing treatment efficacy requires that clinical endpoints be properly evaluated. Clinical tools for symptoms severity assessment should be discriminative, predictive and evaluative.
Methods
In this study we compared a patient-oriented symptoms evaluation (ReQuest™) vs a structured interview assessment initiated by a physician (sickness impact profile [SIP]). Both questionnaires were analyzed in a multidimensional space using latent factors. Five dimensions were found: 1 for the short ReQuest™ questionnaire and 4 for SIP.
Results
We included 1,522 women and 1,296 men; mean age was 36 ± 7 years, and mean body mass index was 26 ± 4. Questionnaire scores assessed by physicians and by patients did not correlate with each other (r between 0.03 and 0.26), except for nausea and sleep disorders (r = 0.45 and 0.51), but both were sensitive enough to detect changes after treatment (P < 0.05). The medical specialty of the physician had an effect on both ReQuest™ and SIP scores, although the proportion of questionnaire variance attributable to specialty was only 2% (P < 0.05).
Conclusions
While both evaluations are orthogonal (non-correlated), meaning that patients and physicians measured diverse aspects of the same disease, both were able to measure patients' improvement with treatment.
Keywords: Monitoring, Physiologic; Pantoprazole; Questionnaires
Introduction
Gastroesophageal reflux disease (GERD) is a highly prevalent gastrointestinal disorder. It is critical that clinical endpoints are properly evaluated in order to assess treatment efficacy. Clinical tools for assessment of symptoms severity should be discriminative, predictive and evaluative.1
Any evaluative scale should measure clinical changes (ie, symptoms severity) over time and under treatment (sensitivity to detect changes); the scales should be objective and interpretable. Very few clinical tools have been developed and validated to evaluate the severity of GERD, and so far, there is no gold standard to determine the responsiveness of a measurement scale in GERD. Despite their subjectivity, GERD questionnaires such as Carlsson-Dent are sensitive enough, compared with endoscopy, to detect people with GERD in a region with low prevalence.2 Because the assessment of symptoms intensity in functional disease is subjective, self-assessment has been considered more accurate than physician evaluation. This concept is founded on the idea that only the person who suffers the pain knows its intensity, whereas a physician can only approximate the patient's pain. Consequently, some researchers pay attention only to the patient's score of symptoms severity.3
Clinimetrics is a word derived from biometrics, and it refers to measurement of clinical symptoms and procedures which are helpful for diagnosis, characterization or evaluation of clinical entities. Any clinimetric measurement has its own variance, and the source of variation regarding symptoms intensity can be associated with factors like race, sex or age. Additionally, other factors external to patients can influence the results, such as medical specialty of the physician administering the interview.
In this study, we tested 2 hypotheses regarding the clinimetric approach to GERD assessment. (1) Symptoms intensity scores would correlate between physician and patient measurements. To test this, we compared the measurement made by physicians (from several specialties) using a Likert scale of symptoms intensity with the sickness impact profile (SIP) vs patients' measurement with ReQuest™. We assumed that both scales would have high internal reliability and a positive coefficient of correlation. We also measured their sensitivity to detect changes in symptoms severity after 4 weeks of pantoprazole magnesium (40 mg orally per day). (2) Physician specialty and patients' disease and characteristics are important sources of variation in symptoms intensity scores. We used a variance decomposition of scores to determine how much of the score variance (attributable percentage) was contributed by patient characteristics and by physician specialty.
Materials and Methods
Study Sample
This was a nationwide study conducted in private, state and hospital-based practices in Mexico. We included subjects between 18 and 45 years old (we excluded older people to decrease the probability of malignancy) with a clinical history of heartburn, acid regurgitation, or both during the previous 3 months. These symptoms are reliable indicators of the presence of GERD, as has been found with other questionnaires.4,5 The present study included a wide sample of GERD patients with few clinical restrictions, and this design supports the external validity of our results for daily clinical practice.
Questionnaires
We asked physicians to conduct a structured interview (SIP) driven by questions that explore symptoms severity associated with GERD. This SIP has 18 questions divided into classical symptoms of GERD (heartburn and regurgitation), dyspepsia and extraesophageal manifestations of GERD. The interview asked patients to rate the severity of heartburn, regurgitation, retching, halitosis, flatulence, nausea, sialorrhea, globus, discomfort, chest pain, dyspnea, chronic cough, early satiety and sleeping disturbances. The scale included categories such as "never," "rare," "sometimes," "most of the time" and "always." All of them were scored on a Likert scale of 4 points. We have previously demonstrated that this structured interview is sensitive enough to detect group differences in symptoms severity and changes in severity with proton pump inhibitor treatment.6,7
The patient-directed instrument, ReQuest™, is a validated questionnaire with high internal consistency (Cronbach α = 0.90) and test-retest reliability (intraclass correlation coefficient between 0.86 for the short version and 0.94 for the long version).8 The short version has 6 questions, each answered on a 10 cm visual analog scale. The 6 questions range in scope from general well-being (quality of life) to acid-related complaints, upper abdominal-related complaints, lower abdominal-related complaints, nausea and sleep disorders.
Statistical Methods
Data were analyzed with SPSS version 15 and Statistica version 8.0. The internal reliability was measured with Cronbach alpha coefficient,9 calculated for ReQuest™ and SIP separately.
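As an illustration of the internal reliability measure named above, the following minimal Python sketch computes Cronbach's alpha from item-level scores with the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score). The function name `cronbach_alpha` and the example scores are hypothetical; the study's own analysis was run in SPSS and Statistica, not with this code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: list of score lists, one list per questionnaire item,
           each holding one value per respondent (same order, same length).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: 3 items answered by 5 respondents.
parallel_items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5],
]
# Hypothetical scores: 2 items that are unrelated to each other.
unrelated_items = [
    [1, 2, 1, 2],
    [1, 1, 2, 2],
]

alpha_parallel = cronbach_alpha(parallel_items)
alpha_unrelated = cronbach_alpha(unrelated_items)
print(round(alpha_parallel, 3), round(alpha_unrelated, 3))
```

Items that move together push alpha toward 1, whereas unrelated items push it toward 0, which is why a high alpha is read as a coherent internal structure.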
Correlation between Likert scores and ReQuest™ was calculated by Spearman's test. Dimensionality was assessed by latent factor analysis with varimax rotation and Kaiser normalization of variables. Factors with an eigenvalue greater than 1 were retained as significant, because each such factor explains at least as much variance as one original standardized variable and therefore accounts for an important amount of the variability in the data. The extracted factor scores were used as dependent variables.
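The eigenvalue-greater-than-1 retention rule described above can be sketched as follows. This is only an illustration of that single step, not of the full latent factor analysis with varimax rotation: the eigenvalues of a correlation matrix are obtained with a plain cyclic Jacobi iteration (a technique chosen here for self-containment, not taken from the study), and the helper names and the 3 × 3 correlation matrix are hypothetical.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-18, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [list(row) for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q], kept within [-pi/4, pi/4].
                if diff == 0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def kaiser_retained(correlation_matrix):
    """Eigenvalues that pass the eigenvalue > 1 retention rule."""
    return [ev for ev in symmetric_eigenvalues(correlation_matrix) if ev > 1.0]


# Hypothetical correlation matrix of 3 questionnaire items.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalues = symmetric_eigenvalues(corr)
retained = kaiser_retained(corr)
print(len(retained), "factor(s) retained out of", len(eigenvalues))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 marks a factor that carries more variance than any single standardized item.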
Differences between specialties were analyzed using analysis of variance (ANOVA), and only those with P < 0.05 were considered for post hoc testing with the Fisher method (data expressed as mean ± SEM). To understand the source of variance, we performed a variance components analysis using type III ANOVA to calculate mean squares and the percentage of explained variance. The specialty of the physician interviewer was considered a random effect; sex was a fixed factor; age and body mass index were considered covariates.
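The idea of a percentage of explained variance can be illustrated with a deliberately simplified sketch. It assumes a one-way layout with a single grouping factor and reports the between-group sum of squares as a share of the total sum of squares; it does not reproduce the type III model with a random specialty effect, a fixed sex factor and covariates that is described above. The function name, group labels and scores are hypothetical.

```python
def percent_variance_explained(groups):
    """Share (in %) of the total sum of squares that lies between groups.

    groups: dict mapping a group label to a list of scores.
    """
    all_scores = [v for values in groups.values() for v in values]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((v - grand_mean) ** 2 for v in all_scores)
    if ss_total == 0:
        raise ValueError("scores have no variability")
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return 100.0 * ss_between / ss_total


# Hypothetical factor scores grouped by the interviewer's specialty.
scores_by_specialty = {
    "specialty_a": [1.0, 2.0, 3.0],
    "specialty_b": [2.0, 3.0, 4.0],
}
pct = percent_variance_explained(scores_by_specialty)
print(f"{pct:.1f}% of the score variance lies between groups")
```

A small percentage can still be statistically significant in a large sample, so the proportion of variance and the P-value answer different questions.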
Ethics
All applicable international regulations concerning the ethical participation of our volunteers were followed during this research. The study and the informed consent document were approved by the Ethical Committee for Research of the Dr. Maximiliano Ruiz Castañeda General Hospital of Naucalpan; and the Center of Bioethics of the Medicine Faculty of the University of Guanajuato, Mexico. All subjects provided informed consent.
Results
Studied Sample
We invited 1,306 physicians from four branches of medicine to participate (gastroenterology, n = 157; internal medicine, n = 218; general surgery, n = 65; and general practice, n = 866). They evaluated 3,665 patients, of whom 2,818 (77%) completed the ReQuest™ questionnaire, were suitable for the analysis and were included in this sample. We included 1,522 (54%) women and 1,296 men; their mean age was 36 ± 7 (SD) years, and the mean body mass index was 26 ± 4 (SD).
Internal Consistency and Correlation Between Sickness Impact Profile and ReQuest™
The Cronbach alpha coefficient showed high internal consistency for ReQuest™ (0.87, P < 0.05) and SIP (0.84, P < 0.05). Most of the correlation coefficients between ReQuest™ and SIP were small (between 0.03 and 0.26); only nausea (r = 0.45, P < 0.001) and sleep disorders (r = 0.51, P < 0.001) showed higher correlations. Table 1 shows this correlation matrix.
Table 1.
All correlations showed P-value < 0.05.
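The correlations reported above are Spearman coefficients, that is, Pearson correlations computed on ranks. A minimal sketch of that calculation, with tied values given their average rank, is shown below; the function names and the example data are hypothetical and are not taken from the study dataset.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: a physician Likert score vs a patient visual analog score.
likert = [0, 1, 1, 2, 3, 4]
visual_analog_cm = [1.0, 2.5, 2.0, 4.0, 3.5, 9.0]
rho = spearman_rho(likert, visual_analog_cm)
print(round(rho, 2))
```

Working on ranks makes the coefficient suitable for comparing an ordinal Likert scale with a continuous visual analog scale, since only the ordering of the answers matters.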
Dimensionality of Scales
The factor analysis including both measurements showed 5 main factors, and one of these factors (F1) was ReQuest™ itself. The other four factors can be described as F2: dyspepsia; F3: sleep disorders and cough; F4: symptoms associated with the larynx-upper esophagus region, such as hoarseness, odynophagia, dysphagia and globus; and F5: classical GERD symptoms (heartburn and regurgitation). The factors obtained, calculated by principal component analysis, were orthogonal to each other, which by definition means that the calculated scores are not correlated.
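The statement that principal component scores are uncorrelated by construction can be checked on a toy case. The sketch below assumes only two variables, for which the principal axes follow from a single rotation angle; it is not the 5-factor solution reported above, and the function names and data are hypothetical.

```python
import math


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pca_scores_2d(x, y):
    """Principal component scores of two variables (first component first)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    sxx, syy, sxy = covariance(x, x), covariance(y, y), covariance(x, y)
    # Angle of the first principal axis of the 2 x 2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    pc1 = [c * a + s * b for a, b in zip(xc, yc)]
    pc2 = [-s * a + c * b for a, b in zip(xc, yc)]
    return pc1, pc2


# Hypothetical, clearly correlated pair of symptom scores.
score_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score_b = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
pc1, pc2 = pca_scores_2d(score_a, score_b)
print("covariance of original scores:", round(covariance(score_a, score_b), 3))
print("covariance of component scores:", round(covariance(pc1, pc2), 12))
```

The original variables covary, but after rotation onto the principal axes the covariance between the two sets of scores vanishes, which is the sense in which the extracted factors are orthogonal.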
Variance Decomposition of the Scores
The ANOVA found differences in symptoms intensity and ReQuest™ scores by physician specialty (Table 2). Gastroenterologists gave the highest scores for heartburn and regurgitation; surgeons gave the highest scores for upper respiratory or esophageal symptoms such as odynophagia, dysphagia and globus. General practitioners gave the highest scores for other symptoms such as flatus and dyspepsia (Table 2). The sex adjustment showed that women had higher scores for symptoms severity than men (data not shown), as has been found in other studies.
Table 2.
a,b,c Indicate homogeneous groups using Scheffe contrast, where b > c > a.
BMI, body mass index; SIP, sickness impact profile.
Data were expressed as mean ± SEM.
The variance decomposition showed that physician specialty accounted for less than 2% of the total variance. Interestingly, despite the lack of correlation between them, both SIP and ReQuest™ showed a clinically and statistically significant response after 4 weeks of pantoprazole treatment (Figure).
Discussion
This study demonstrated that symptoms severity of GERD assessed by physicians did not correlate with symptoms severity assessed by patients. This could be interpreted as the assessment of different aspects (2 different points of view of the same patients' symptoms). Interestingly, the physician-administered and patient-administered tools both have high internal reliability for measurement of the same disease, but they were non-correlated, as we demonstrated by their orthogonality in a multidimensional mathematical space (latent factor analysis). In other words, they measure diverse dimensions: one of them is occupied by all ReQuest™ questions, while the other four factors were measured by the physician evaluation with SIP. However, both tools recorded improvement of patients' symptoms after 4 weeks of pantoprazole magnesium treatment, which means that GERD, measured as multidimensional traits, improved with treatment.
Should Clinimetry Be Evaluated by Physicians or Patients?
We expected a positive correlation between the questions of the structured interview and ReQuest™, but instead we found that evaluation of SIP and ReQuest™ was orthogonal. Thus, even though both physicians and patients were evaluating the same disease, the dimensions measured were not the same. After treatment, measurements by both tools showed improvement.
Neither of the measurements showed any great advantage over the other in evaluating symptoms severity during treatment. Moreover, the multidimensionality of the evaluation helped to check improvement not only for classical GERD symptoms (heartburn and regurgitation) but also for extraesophageal and overlapping symptoms. Some of the score variation in both tools was due to perception associated with the medical specialty of the physician and to women's perception of symptoms severity. We think that in a larger sample size, the effect of the medical specialty of the participating physicians would diminish the total error variation. For a smaller sample size, the stratification of specialties becomes the rule. Perhaps the next questions, then, should be when to evaluate the symptoms and who should initiate the evaluation: physicians or patients?
Advantages of the Method
The methods we used had some interesting components. We invited a large number of centers to participate in the study in order to increase external validity of our results for daily clinical practice.
It has been recommended that patient-driven evaluation of symptoms is the most effective. The logic of this suggestion rests on the subjectivity of evaluating severity in any functional disease: patients know better than physicians how they feel.10 We compared 2 tools that should be opposed from this point of view, one completed by patients and the other completed by physicians. The high alpha coefficients support that the structured questions and ReQuest™ each have a coherent structure. The lack of correlation between them was corroborated by the orthogonal dimensions found in the factor analysis, which is consistent with ReQuest™ being a unidimensional measurement while the structured interview covers 4 dimensions. We analyzed the variance composition of the latent factor scores and found that the medical specialty of the physician influenced the perception of symptoms severity of GERD.
Limitations of the Study
The relationship between ReQuest™ and the structured interview remains unclear; despite the fact that they were uncorrelated, both tools measured improvement with treatment, meaning that they had a communality affected by treatment. We will study further the co-decomposition of variance between these tools.
The factor analysis should be used to rebuild the structured interview. The factor scores obtained are non-dimensional; however, they can be helpful for selecting new clusters of questions to evaluate symptoms intensity in GERD.
Conclusions
We conclude that the patient- and physician-driven tools do not correlate because they measure diverse orthogonal dimensions. However, both tools are sensitive enough to detect favorable changes in symptoms severity associated with 4 weeks of treatment with pantoprazole magnesium.
Acknowledgements
Dr. Vargas-Romero conceptualized the study with critical analysis and wrote the paper; Dr. Lopez-Alvarenga and Dr. Sobrino-Cossio performed the analysis and wrote the paper. Dr. Ronnie Fass critically reviewed the manuscript and made structural modifications to the paper.
Footnotes
Financial support: None.
Conflicts of interest: Nycomed Mexico SA de CV supported this study. Competing Interests: Dr. Vargas-Romero is Medical Director of Nycomed SA de CV. Dr. Lopez-Alvarenga is Biometrical advisor of Nycomed SA de CV.
References
1. Stanghellini V, Armstrong D, Mönnikes H, Bardhan KD. Systematic review: Do we need a new gastro-oesophageal reflux disease questionnaire? Aliment Pharmacol Ther. 2004;19:463–479. doi: 10.1046/j.1365-2036.2004.01861.x.
2. Netinatsunton N, Attasaranya S, Ovartlarnporn B, Sangnil S, Boonviriya S, Piratvisuth T. The Value of Carlsson-dent questionnaire in diagnosis of gastroesophageal reflux disease in area with low prevalence of gastroesophageal reflux disease. J Neurogastroenterol Motil. 2011;17:164–168. doi: 10.5056/jnm.2011.17.2.164.
3. Irvine EJ, Feagan BG, Wong CJ. Does self-administration of a quality of life index for inflammatory bowel disease change the results? J Clin Epidemiol. 1996;49:1177–1185. doi: 10.1016/0895-4356(96)00136-9.
4. Locke GR, Talley NJ, Weaver AL, Zinsmeister AR. A new questionnaire for gastroesophageal reflux disease. Mayo Clin Proc. 1994;69:539–547. doi: 10.1016/s0025-6196(12)62245-9.
5. Moreno Elola-Olaso C, Rey E, Rodríguez-Artalejo F, Locke GR 3rd, Díaz-Rubio M. Adaptation and validation of a gastroesophageal reflux questionnaire for use on a Spanish population. Rev Esp Enferm Dig. 2002;94:745–758.
6. Lopez-Colombo A, Lopez-Alvarenga JC, Vargas J, et al. Do prior pregnancies modify the intensity of symptoms related to GERD? A report of the Mexican GERD working group. Gut. 2008;57(suppl 2):A312.
7. Lopez L, Lopez-Alvarenga JC, Comuzzie AG, Gonzalez J, Crespo Y, Vargas J. Nighttime GERD symptoms associated with dyspepsia, esophageal discomfort, and extraesophageal complaints improvement after treatment with Pantoprazole magnesium; 40 mg daily for 28 days. Gastroenterology. 2008;134(suppl 1):A176–A177.
8. Mönnikes H, Bardhan KD, Stanghellini V, Berghöfer P, Bethke TD, Armstrong D. Evaluation of GERD symptoms during therapy. Part II. Psychometric evaluation and validation of the new questionnaire ReQuest in erosive GERD. Digestion. 2004;69:238–244. doi: 10.1159/000079708.
9. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297–334.
10. Brandt LJ, Bjorkman D, Fennerty MB, et al. Systematic review on the management of irritable bowel syndrome in North America. Am J Gastroenterol. 2002;97(11 suppl):S7–S26. doi: 10.1016/s0002-9270(02)05657-5.