Abstract
Objective:
Psychometric studies evaluating the reliability of eating disorder assessment among individuals with binge-eating disorder (BED) are limited. The current study documents the interrater reliability and internal consistency of the Eating Disorder Examination (EDE) interview when administered to adults with DSM-5-defined BED.
Method:
Participants (N=56) were adults seeking treatment for BED in the context of clinical trials testing pharmacological and psychological treatments. Doctoral-level, trained, and supervised clinical researchers evaluated eating-disorder psychopathology using the EDE interview and audio-recorded the interview. A second doctoral-level, trained, and supervised clinical researcher, who did not conduct the initial assessment, coded eating-disorder psychopathology using the audio recording.
Results:
Agreement between raters on the number of binge-eating episodes was near perfect. There was excellent interrater reliability for nearly all scales of the EDE interview. Agreement between raters for behavioral indicators of loss of control and marked distress about binge eating ranged from moderate to perfect. Internal consistency was variable for all scales, ranging from unacceptable to good.
Discussion:
Our study suggests that the EDE can be administered reliably by multiple interviewers to assess adults with BED. However, internal consistency was mostly subpar. Tests of reliability and other psychometric properties (e.g., validity) in other patient groups, such as children with BED, are warranted.
Keywords: binge-eating disorder, psychometrics, eating disorder, assessment, diagnosis
Introduction
Binge-eating disorder (BED) is the most prevalent formal eating disorder (1) and is characterized by the sense of losing control while overeating, accompanied by distress, but not accompanied by the weight-compensatory behaviors that characterize bulimia nervosa (i.e., purging) (2, 3). While obesity is associated strongly with BED, particularly in treatment-seeking samples, BED is distributed across all weight categories in epidemiological studies (4). Assessment of eating-disorder psychopathology can be challenging because it requires accurate assessment of complex thoughts, behaviors, and intents behind behaviors, and because many eating disorder symptoms are associated with strong affect and distress. Assessment, however, is an important component of diagnosis and clinical treatment planning, as well as of clinical research, particularly as the presence or absence of a single behavior or understanding of the intent behind a behavior could change the diagnosis. To date, research has established the utility of using eating disorder instruments such as the Eating Disorder Examination (EDE) interview to assess eating-disorder psychopathology among individuals across eating disorder diagnoses, including BED, as well as individuals across the weight spectrum, including overweight/obesity (5–9). However, these studies were conducted using the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) research criteria for BED (10). To maintain the rigor of research conducted with the EDE and ensure reproducible results, the reliability of this instrument needs to be established among adults meeting current Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) diagnostic criteria for BED (2).
The EDE is an investigator-based, semi-structured interview that assesses eating-disorder psychopathology (11). The EDE, which was developed originally to assess individuals with anorexia nervosa and bulimia nervosa (12), assesses eating-disorder psychopathology and the frequency of binge-eating, overeating, and extreme weight-compensatory behaviors (e.g., purging) (13). Clinically, the standard EDE interview was hypothesized to comprise four subscales as well as a global (total) severity score of eating-disorder psychopathology: dietary restraint, eating concern, weight concern, and shape concern. However, empirical work using factor-analytic methods has supported, consistently, a briefer alternative scoring method (7 items total: dietary restraint [3 items], overvaluation of shape/weight [2 items], and body dissatisfaction [2 items]) (14–17).
Significant clinical training is necessary to administer the EDE. Research across various community and patient groups examining interrater reliability has generally suggested good reliability can be achieved (5). Interrater reliability is a metric that applies to interviewers and reflects agreement across multiple interviewers in item coding (frequency, severity), including items that are used to establish diagnoses. In patients with DSM-IV-defined BED, interrater reliability across EDE subscales was excellent in one study (>.91) (8) and good in another study (>.83) (18), which is equal to or superior to interrater reliability for other patient groups (5, 19, 20). Internal consistency of the EDE scales in patients with DSM-IV-defined BED, however, has been suboptimal. Internal consistency is a metric that applies to scales more than to interviewers and reflects how well (or how poorly) items belonging to the same subscale (capturing a specific construct) are answered in a consistent pattern. In a large study that aggregated data across three sites conducting treatment studies for individuals with BED, internal consistency ranged from .51 to .68 (15). In a study that examined psychometric properties of the EDE interview in African American and White patients with BED, internal consistency ranged from .51 to .82 (7). Internal consistency for individuals with BED was similar to internal consistency for other clinical and non-clinical groups with the exception of weight concern, which was lower than other groups (5). These studies all included individuals with overweight/obesity as well as BED. Prior work has noted challenges in eating-disorder assessment among patients with elevated weight (21, 22), including those who have had bariatric surgery (23, 24). 
Dietary restraint demonstrated particularly poor psychometric qualities in those studies, which might reflect that the construct focuses on intent (i.e., “whether or not you succeeded”), and might also reflect that the dietary restraint subscale includes some items that could be classified as non-disordered weight management strategies (e.g., “Over the past four weeks have you been consciously trying to restrict (cut back) the overall amount that you eat, whether or not you have succeeded?”) but other items could be classified as more extreme weight loss strategies (e.g., “Over the past four weeks have you wanted your stomach to be [completely] empty?”).
The limited work that has been conducted on the EDE interview in the context of the DSM-5 (23, 24) has not established the reliability of the interview for patients meeting DSM-5 criteria for BED. Therefore, the purpose of the current study was to provide an update of the reliability of the EDE in patients with DSM-5-defined BED and additionally to provide new information about the reliability of the EDE using the brief alternative scoring method (15) and the reliability of key diagnostic items, including behavioral indicators of loss of control and marked distress. Understanding the reliability of the EDE will help inform decisions about whether to use the instrument in clinical research with this patient group and will provide context for the interpretation of scores. Specifically, among individuals meeting criteria for DSM-5-defined BED, we tested two forms of reliability: interrater reliability and internal consistency of the total score and subscales. We tested these forms of reliability using the original and brief (15) scoring methods of the EDE interview. Additionally, we tested interrater reliability of DSM-5 specific diagnostic items (binge-eating episodes, behavioral indicators of loss-of-control, and marked distress).
Method
Procedure
Participants (N=56) responded to advertisements for treatment studies testing a variety of psychological and pharmacological treatments for BED conducted at an urban, medical-school based program located in the northeastern United States. The treatment studies, including the procedures described here, received approval from the university’s ethical review board (IRB). All participants provided written informed consent, including consent for audio-taping the interviews.
The EDE interview was administered as part of a comprehensive initial evaluation. All research clinicians had doctoral training in psychology, were trained by investigators in the EDE, and were supervised to optimize reliability and prevent drift. Training included didactic explanations of the standardized interview procedures and descriptions of the latent construct for each EDE item. This was followed by each trainee observing and co-coding at least two interviews conducted by an expert assessor, followed by discussions of any coding discrepancies. Subsequently, each trainee conducted at least one EDE interview with an expert assessor present and co-coding, again followed by discussions of any discrepancies. Once reliability was established by the training process, research clinicians conducted EDE interviews independently while receiving ongoing supervision by investigators that involved discussions of how to code any unclear items, as well as ongoing monitoring of audio-recorded interviews. The co-coding conducted during the training process was not included in the current study of interrater reliability.
During the baseline assessment, participants were evaluated by research clinicians. To be included in the current study, participants had to meet DSM-5 criteria for BED and have an audio-recorded EDE. A series of participants meeting these eligibility criteria were randomly selected using an online generator for the current study; their EDE interviews were coded by a second research clinician. The second research clinician was unaware of the first research clinician’s scoring of the EDE.
Participants
Participants were between 18 and 70 years old, were seeking treatment for DSM-5-defined BED, and had a BMI > 21.5 kg/m2 and < 50.0 kg/m2. Participants were excluded prior to baseline for serious medical conditions (e.g., cancer), severe mental illness that could interfere with clinical assessment (e.g., psychosis), concurrent treatment for eating/weight concerns, or pregnancy. Table 1 summarizes participants’ sociodemographic characteristics.
Table 1.
Demographic characteristics of participants
| | n | % | M | SD |
|---|---|---|---|---|
| Gender | | | | |
| Male | 13 | 23.2% | | |
| Female | 42 | 75.0% | | |
| Transgender | 1 | 1.8% | | |
| Race | | | | |
| White or Caucasian | 42 | 75.0% | | |
| Asian or Asian American | 2 | 3.6% | | |
| Black or African American | 6 | 10.7% | | |
| Multiracial | 6 | 10.7% | | |
| Ethnicity | | | | |
| Hispanic/Latinx | 10 | 17.9% | | |
| Not Hispanic/Latinx | 46 | 82.1% | | |
| Education | | | | |
| High school or less | 11 | 19.6% | | |
| Some college | 10 | 17.9% | | |
| College degree | 15 | 26.8% | | |
| More than college | 18 | 32.1% | | |
| Age (years) | 56 | | 42.84 | 12.7 |
| Body Mass Index (kg/m2) | 56 | | 36.29 | 5.2 |
Measures
Body Mass Index (BMI).
Participants’ height and weight were measured. Research clinicians calculated BMI (kg/m2).
Eating Disorder Examination (EDE).
The EDE evaluates eating-disorder psychopathology in the past 28 days, and over longer intervals as items correspond with diagnostic criteria (11). All subscale and total scores range from 0 through 6, with higher scores indicating more psychopathology. The interview also assesses objective binge-eating episodes (OBEs, perceiving a loss of control while overeating an unusually large amount of food; this corresponds to the DSM-5-based definition of binge-eating episodes), as well as other criteria required for the DSM-5 diagnosis: five behavioral indicators of loss of control (eating much more rapidly than normal, eating until feeling uncomfortably full, eating large amounts of food when not feeling physically hungry, eating alone because of feeling embarrassed, and feeling disgusted, depressed, or guilty after eating; rated as present or absent) and “marked distress” about binge eating (rated on a scale from 1=no distress to 5=very distressed).
Statistical Analyses
Continuous items were first examined for adherence to normality and found to satisfy this assumption. Intra-class correlation coefficients (ICCs) with 2-way random effects examined interrater reliability for the original EDE subscales (Dietary Restraint, Eating Concern, Weight Concern, Shape Concern) and the corresponding global severity score, as well as for the brief EDE subscales (Restraint, Overvaluation, and Body Dissatisfaction). ICCs are considered poor at <.50, moderate at .50–.75, good at .75–.90, and excellent at >.90 (25, 26).
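As an illustration of the interrater analysis described above, the following is a minimal sketch of a two-way random-effects ICC computed from ANOVA mean squares. The text does not state which ICC form (single versus average measures, absolute agreement versus consistency) or which software was used, so the sketch assumes the single-measure, absolute-agreement form, often written ICC(2,1). The function names `icc_2_1` and `interpret_icc` are hypothetical, and assigning the shared boundary value .75 to "good" is also an assumption, because the stated ranges overlap at that value.

```python
import numpy as np

def icc_2_1(ratings):
    """Two-way random-effects ICC, absolute agreement, single rater (assumed form).

    ratings: array-like of shape (n_participants, k_raters).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)   # one mean per participant
    col_means = x.mean(axis=0)   # one mean per rater

    # Sums of squares for a two-way layout with one observation per cell.
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def interpret_icc(icc):
    """Label an ICC using the thresholds stated in the text (25, 26)."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"   # assumption: .75 itself is labeled "good"
    return "excellent"
```

With made-up scores in which the second rater always codes one point higher than the first, `icc_2_1([[1, 2], [2, 3], [3, 4]])` is 2/3 rather than 1, which shows that the absolute-agreement form penalizes a systematic offset between raters even when their rank ordering is identical.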
Kappas measured the agreement between two raters for the number of binge-eating episodes (frequency over the past 28 days), behavioral indicators of loss of control, and distress. Agreement is considered slight at kappa <.20, fair at .21–.40, moderate at .41–.60, substantial at .61–.80, and near perfect at >.80 (27).
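The agreement statistic described above can be sketched as follows. The text does not specify whether kappa was weighted or unweighted, so this sketch assumes unweighted Cohen's kappa for two raters, in which every distinct code (including each distinct episode count) is treated as its own category. The function names are hypothetical, and the "perfect" label for a kappa of exactly 1.00 follows the wording used in Table 2 rather than the cited thresholds.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters coding the same participants."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must code the same non-empty set of cases.")
    n = len(rater_a)

    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        return float("nan")   # undefined when both raters use a single category
    return (observed - expected) / (1 - expected)

def interpret_kappa(kappa):
    """Label a kappa using the thresholds stated in the text (27)."""
    if kappa == 1.0:
        return "perfect"      # label taken from Table 2, not from the thresholds
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "near perfect"
```

For example, with hypothetical present/absent codes `[1, 1, 0, 0]` from the first rater and `[1, 0, 0, 0]` from the second, observed agreement is .75 and chance agreement is .50, giving a kappa of .50 ("moderate").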
Cronbach’s alpha examined internal consistency for the original EDE subscales, global severity score, and the brief EDE subscales. Internal consistency is considered to be unacceptable at Cronbach’s alpha <.50, poor at .51–.60, questionable at .61–.70, acceptable at .71–.80, good at .81–.90, and excellent at >.90 (28).
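For readers less familiar with the statistic, a minimal sketch of Cronbach's alpha from an item-by-participant score matrix follows, using the standard formula based on item variances and the variance of the summed score. The function names are hypothetical, the example data in the note below are invented, and this is not the authors' analysis code. The interpretation labels follow the thresholds stated above; values falling in the small gaps between the stated ranges (e.g., exactly .50) are assigned to the adjacent higher label as an assumption.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for scores of shape (n_participants, k_items)."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

def interpret_alpha(alpha):
    """Label an alpha using the thresholds stated in the text (28)."""
    if alpha < 0.50:
        return "unacceptable"
    if alpha <= 0.60:
        return "poor"
    if alpha <= 0.70:
        return "questionable"
    if alpha <= 0.80:
        return "acceptable"
    if alpha <= 0.90:
        return "good"
    return "excellent"
```

Two invented items that rise together perfectly across three participants (`[[1, 2], [2, 3], [3, 4]]`) give an alpha of 1.0, whereas items that only partly track each other (`[[1, 1], [2, 3], [3, 2]]`) give an alpha of 2/3, which would be labeled "questionable".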
Results
Overall, participants had an average of 16 binge-eating episodes (SD=11.52) in the 28 days prior to assessment. The average BMI of participants was 36.29 kg/m2 (SD=5.17; range=23.77 through 44.88 kg/m2). The mean global severity score on the EDE was 2.48 (SD=0.89). Table 2 summarizes descriptive data for the EDE subscales for both the original and brief scoring methods, as well as binge-eating episodes, behavioral indicators of loss of control, and distress. All intra-class correlations were in the “excellent” range (>.90), except for the Body Dissatisfaction subscale of the EDE brief version, which was just under the threshold in the “good” range (ICC=.89). Kappa statistics were calculated for single items and were in the “near perfect” range (>.80) for binge-eating episodes, eating more rapidly than usual, eating until uncomfortably full, and feeling disgust, depression, or guilt after an episode. Agreement was “substantial” for eating alone due to embarrassment and “moderate” for eating in the absence of hunger and marked distress about binge eating.
Table 2.
Descriptive statistics, interrater reliability and internal consistency of the EDE scales and BED items
| | N | M | SD | ICC | Kappa | Interrater Reliability Interpretation | Cronbach’s α | Internal Consistency Interpretation |
|---|---|---|---|---|---|---|---|---|
| Binge-eating episodes | 55 | 16.42 | 11.52 | | 0.83 | Near Perfect | | |
| Behavioral Indicators | | | | | | | | |
| Eating Rapidly | 55 | 0.69 | 0.47 | | 0.96 | Near Perfect | | |
| Uncomfortably Full | 55 | 0.91 | 0.29 | | 1.00 | Perfect | | |
| Absence of Hunger | 55 | 0.96 | 0.19 | | 0.48 | Moderate | | |
| Eating Alone | 55 | 0.78 | 0.42 | | 0.76 | Substantial | | |
| Disgust, Depression, Guilt | 55 | 0.98 | 0.14 | | 1.00 | Perfect | | |
| Marked Distress | 54 | 3.91 | 0.62 | | 0.56 | Moderate | | |
| Original EDE | | | | | | | | |
| Global Score | 56 | 2.48 | 0.89 | 0.98 (0.97 – 0.99) | | Excellent | 0.86 | Good |
| Dietary Restraint | 56 | 1.69 | 1.14 | 0.97 (0.95 – 0.98) | | Excellent | 0.49 | Unacceptable |
| Eating Concern | 56 | 1.82 | 1.26 | 0.97 (0.95 – 0.98) | | Excellent | 0.66 | Questionable |
| Weight Concern | 56 | 3.01 | 1.00 | 0.96 (0.94 – 0.98) | | Excellent | 0.54 | Poor |
| Shape Concern | 56 | 3.40 | 1.13 | 0.99 (0.98 – 0.99) | | Excellent | 0.76 | Acceptable |
| Brief EDE | | | | | | | | |
| Global Score | 56 | 3.32 | 1.12 | 0.97 (0.94 – 0.98) | | Excellent | 0.68 | Questionable |
| Restraint | 56 | 2.50 | 1.71 | 0.97 (0.96 – 0.98) | | Excellent | 0.52 | Poor |
| Overvaluation | 56 | 3.36 | 1.75 | 0.96 (0.93 – 0.98) | | Excellent | 0.88 | Good |
| Body Dissatisfaction | 56 | 4.51 | 1.12 | 0.89 (0.82 – 0.94) | | Good | 0.65 | Questionable |
Means (M), standard deviations (SD), and Cronbach’s alpha are from the first rater’s coding. Intra-class correlations (ICCs) and kappa statistics compare the first rater’s and second rater’s coding. EDE = Eating Disorder Examination interview; OBE = objective binge-eating episode. ICCs are considered poor at <.50, moderate at .50–.75, good at .75–.90, and excellent at >.90. Agreement is considered slight at kappa <.20, fair at .21–.40, moderate at .41–.60, substantial at .61–.80, and near perfect at >.80. Cronbach’s alpha is considered unacceptable at <.50, poor at .51–.60, questionable at .61–.70, acceptable at .71–.80, good at .81–.90, and excellent at >.90.
Internal consistency alphas were calculated for the original and brief (15) methods of scoring subscales for the EDE. For the original EDE global score and subscales, internal consistency was “good” for the global score, “acceptable” for Shape Concern, “questionable” for Eating Concern, “poor” for Weight Concern, and “unacceptable” for Dietary Restraint. For the brief EDE global score and subscales, internal consistency was “good” for overvaluation, “questionable” for the global score and Body Dissatisfaction, and “poor” for Restraint.
Discussion
The current study describes the reliability of the EDE interview administered to individuals who were evaluated in a baseline assessment for treatment studies for DSM-5-based BED. Establishing the reliability of the EDE interview among patients with BED across the weight continuum is important for accuracy in clinical research and appropriate care in clinical settings. Reliability is important because the EDE can determine an eating-disorder diagnosis, as well as characterize the associated psychopathology, which should guide eligibility criteria for research studies and treatment planning in clinical settings. Reliable instruments also minimize error in repeated assessments, thereby capturing change over time more accurately (e.g., change in binge-eating episode frequency from baseline to end of treatment). There was excellent agreement among raters for nearly all subscales of the EDE using both scoring methods (original and brief (15)), and one subscale was just under the threshold in the “good” range. These findings make an important contribution to the literature documenting the psychometric properties of the EDE. Broadly speaking, our findings support earlier literature that showed the feasibility of multiple assessors administering the EDE interview reliably to patients with eating disorders, including BED (5, 7, 8, 18). Our findings support that the EDE interview can be administered reliably by different assessors to individuals with DSM-5 defined BED. Our findings also extend the literature by demonstrating that the brief alternative scoring method of the EDE (15) also has strong interrater reliability. The reliability of the brief EDE interview is important because the original EDE interview is time-intensive for clinicians, and a brief version of the interview could allow the instrument to have greater accessibility in broader clinical and clinical-research settings. 
Two other studies have examined interrater reliability of the brief EDE interview; those studies were conducted with individuals pre- or post-bariatric surgery and had similar findings to our study (23, 24).
Kappa for binge-eating episodes was near perfect, which provides confidence for the assessment of this critical variable, which is the core behavioral feature of DSM-5-defined BED, and which is often the primary outcome of prevention and treatment research on BED (29, 30). Our study provides new information on the interrater reliability of behavioral indicators and marked distress, which are required criteria in the DSM-5 diagnosis for BED. Kappa statistics for the five behavioral indicator items and the rating of distress were all at least moderate. While “moderate” and higher interrater reliability is encouraging for the use of these items, it is possible that these items could be improved by additional guidance or training to remove some subjectivity or uncertainty. For example, anchoring the rating of distress in terms of whether it warrants clinical attention because of impairment could improve agreement between interviewers. Likewise, continuing supervision that is anchored in guidance for the instrument could prevent uncertainty or drift, particularly for ambiguous items.
In contrast to the strong interrater reliability, the internal consistency of the subscales was variable, as in the literature (5, 7, 15). In our study, in the full EDE, the global score was “good,” but one subscale was “questionable,” one was “poor,” and one was “unacceptable.” In a systematic review of EDE psychometrics that included internal consistency, alphas ranged from “unacceptable” to “good” (5). In our study, in the brief EDE, internal consistency of subscales was not much better; one was “good,” but one was “questionable,” and another was “poor.” The dietary restraint subscale was particularly poor in both the full and brief forms in our study; in the literature, internal consistency of dietary restraint ranged from 0.58 (“poor”) through 0.78 (“acceptable”), with the lowest alpha among participants with obesity (5). In previous work, internal consistency was marginally better in patient groups with a greater composition of patients with anorexia nervosa and bulimia nervosa (i.e., those with more “restrictive” eating disorders), but still variable (5). While this indicates that it is challenging to use the dietary restraint subscale with patients with BED, the field needs more research to disentangle and better measure the complex constructs that underlie dietary restraint and their significance with regard to both binge eating and weight (22, 31). As with previous work, only the global severity score and Shape Concern subscale in the full EDE, and overvaluation of weight/shape in the brief EDE, had internal consistency that was at least acceptable (5, 8, 14). Our study extends prior work by adding information about the internal consistency of the brief EDE interview subscales. The sole other study that examined the brief EDE interview reported similar internal consistency of subscales (23).
These internal consistency data suggest that clinical research, particularly with repeated measures that rely on subscales, may want to consider alternative interview measures or complementary measures.
Despite the similarity of our findings with those from the literature, weak internal consistency is problematic. Internal consistency measures the extent to which items grouped together on a scale or subscale are answered in a similar manner by participants. Weak internal consistency might be indicative of multiple constructs included on the scale or complexities with the construct being measured. Complexities can be related to heterogeneity in individual experiences, or differences in other (not measured) attributes that influence the construct of interest. In the current study, this may mean that participants with BED had variable experiences with eating, weight, and body image, rather than a more uniform experience. Internal consistency is also related to the number of items on a scale, the amount of redundancy across items, sample size, item wording (e.g., double-barreled language) and response patterns (e.g., restricted ranges). The pattern of our findings is somewhat related to the number of items on the scale, with scales that have higher numbers of items also having more internally consistent scores. However, the brief EDE interview subscales (but not the global score) outperformed the original EDE interview subscales, albeit marginally, in internal consistency despite having fewer items. This may indicate that the brief EDE interview subscales had more focused constructs than the more complex original EDE interview subscales.
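The dependence of internal consistency on the number of items, noted above, can be illustrated with the standardized form of Cronbach's alpha, which expresses alpha as a function of the number of items k and the mean inter-item correlation. This is a sketch under simplifying assumptions (standardized items with a common mean correlation); the function name and the mean correlation of .30 are hypothetical and are not estimates from the current data.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Hypothetical mean inter-item correlation; item counts mirror the brief
# scoring method (2- and 3-item subscales, 7 items in total).
for k in (2, 3, 7):
    print(k, round(standardized_alpha(k, 0.30), 2))
```

Holding the mean inter-item correlation fixed, alpha rises as items are added, so a 2-item scale needs considerably more strongly correlated items than a longer scale to reach the same alpha. This is consistent with the interpretation that brief subscales with comparable or better alphas are measuring more focused constructs.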
Given the broad range of biopsychosocial correlates of eating-disorder psychopathology and emerging data on patients and psychopathology from latent profile analysis (32) and network analysis (33), it is likely that the complexities of measuring eating-disorder psychopathology produce response patterns across patients that are not accounted for by diagnostic criteria or the EDE items and subscales. Internal consistency of the dietary restraint subscale was particularly problematic using both the original and brief scoring methods. The EDE conceptualizes and measures dietary restraint in a manner more consistent with highly “restrictive” manifestations of eating disorders (e.g., anorexia nervosa and bulimia nervosa), because this was the clinical population the measure was initially designed to assess. However, the EDE items that comprise the dietary restraint subscale include both non-disordered weight management behaviors and more extreme weight loss behaviors, and the EDE is only one method of measuring dietary restraint in the literature (31, 34). It is possible that the more extreme dietary restraint as measured by the EDE is both less common and less clinically relevant in BED than in other eating disorders. Indeed, dietary restraint subscale scores are typically significantly lower in patients with BED than in those with bulimia nervosa (35), and internal consistency of EDE subscales trends lower in studies with BED than in those with anorexia nervosa and bulimia nervosa (5). Finally, heterogeneity is likely, both within and between diagnostic groups and weight levels, in how patients think about or act related to restricting their eating. Future research, including research that tests alternative items for the restraint subscale and uses qualitative strategies to explore how patients choose to respond to these items, could further validate and improve upon current assessment instruments.
Taking together the findings from our study as well as the broader literature, the use of the EDE interview (both original and brief) is supported for clinical research and practice, but with some caution. Considering the variable internal consistency, it appears prudent to use the global score when using the original scoring method, rather than subscale scores, and to use the overvaluation subscale score when using the brief scoring method, whenever possible. These were the only two scores with “good” internal consistency in our study. More research is needed to explore why internal consistency may be low in this patient population, and to compare the reliability of the EDE in this patient population with that in other groups with different clinical characteristics (e.g., those with bulimia nervosa and elevated weight). It is also essential for researchers to continue to report internal consistency for transparency. Additionally, the interrater reliability of the EDE interview, while excellent in the current study and in the literature, was examined (in the present study and most previously published studies) among highly trained and closely monitored research clinicians. While this practice enhances the rigor of research findings and allows more confidence in the detection of changes over time, the reliability of assessors conducting the EDE interview when they have been neither trained nor monitored with similar procedures is unknown. Further evidence of interrater reliability across levels of expertise and instrument-specific training is needed to understand the generalizability of findings and the feasibility of disseminating EDE interviews to clinicians in the community.
However, given the evidence from the current study that the brief alternative scoring method of the EDE interview (15) has strong interrater reliability, and other studies supporting superior psychometric properties of the alternative brief scoring method (14–17), a version of the EDE interview that places less of a burden on patients and requires less time to administer is promising and worth further investigation.
The participant group for the current study is both a strength and limitation. Our findings extend the literature to treatment-seeking patients with DSM-5-defined BED. Strengths of the present study were the use of expert raters, DSM-5-defined BED, examination of two forms of reliability, and examination of the original and brief structures of the EDE interview. However, our findings may not generalize to individuals who do not seek treatment, who do not seek treatment in the context of clinical research, or those from other geographical locations. Likewise, although our study had some diversity, findings may not generalize to patient groups that were underrepresented in our study, including ethnic/racial minorities and gender minorities, and may not generalize to patient groups that were excluded from the current study, including those with serious medical conditions, youth, and older adults.
In conclusion, the EDE interview remains a reliable instrument for the assessment of eating-disorder psychopathology among individuals with DSM-5-defined BED. The original and brief alternative (15) scoring methods of the EDE interview both have strong interrater reliability but variable internal consistency. Further research is needed to understand and improve upon the internal consistency of scores. Clinical research examining changes in eating-disorder psychopathology longitudinally, in response to prevention efforts, and in response to treatments, can use the EDE interview confidently with multiple raters but should consider using the global severity score rather than subscale scores (with the exception of overvaluation, which also had “good” internal consistency) and should report internal consistency of scores when publishing findings.
Study Importance.
What is already known about this subject?
Reliability is particularly important in clinical research that has multiple interviewers and repeated assessments, such as clinical trials.
Preliminary data support the use of the Eating Disorder Examination interview in DSM-IV-defined binge-eating disorder.
What are the new findings in your manuscript?
In patients with DSM-5-defined binge-eating disorder across BMI levels, results demonstrate excellent interrater reliability of the original and alternative brief scales of the Eating Disorder Examination interview as well as specific diagnostic items.
Results show variable internal consistency of the subscale scores.
How might your results change the direction of research or the focus of clinical practice?
The Eating Disorder Examination interview is a reliable instrument to be used by multiple interviewers in the assessment of adults with binge-eating disorder.
Researchers are cautioned that the global (original) score and the overvaluation of shape/weight (brief) score had better internal consistency than other subscales and should be used in lieu of others whenever possible.
Funding:
This research was supported, in part, by National Institutes of Health grants K23 DK115893, R01 DK49587, R01 DK114075, and R01 DK112771. Funders played no role in the content of this paper.
Footnotes
Data Sharing: De-identified data will be provided in response to reasonable written request to achieve goals in an approved written proposal.
Disclosure: The authors declare no conflict of interest relevant to this article. Dr. Lydecker reports broader interests, which did not influence this research, including honoraria for lectures at universities and scientific conferences and participation in review panels. Dr. Grilo reports broader interests, which did not influence this research, including honoraria for CME activities and lectures at scientific conferences, and royalties from Guilford Press and Taylor & Francis Publishers for academic books.
References
- 1. Udo T, Grilo CM. Prevalence and correlates of DSM-5-defined eating disorders in a nationally representative sample of US adults. Biol Psychiatry 2018;84:345–54.
- 2. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 5th ed. Arlington, VA: American Psychiatric Publishing; 2013.
- 3. Stein DJ, Szatmari P, Gaebel W, Berk M, Vieta E, Maj M, et al. Mental, behavioral and neurodevelopmental disorders in the ICD-11: an international perspective on key changes and controversies. BMC Med 2020;18(1):21.
- 4. Udo T, Grilo CM. Psychiatric and medical correlates of DSM-5 eating disorders in a nationally representative sample of adults in the United States. Int J Eat Disord 2019;52:42–50.
- 5. Berg KC, Peterson CB, Frazier P, Crow SJ. Psychometric evaluation of the Eating Disorder Examination and Eating Disorder Examination-Questionnaire: A systematic review of the literature. Int J Eat Disord 2012;45:428–38.
- 6. Berg KC, Peterson CB, Frazier P, Crow SJ. Convergence of scores on the interview and questionnaire versions of the Eating Disorder Examination: A meta-analytic review. Psychol Assess 2011;23:714–24.
- 7. Lydecker JA, White MA, Grilo CM. Black patients with binge-eating disorder: Comparison of different assessment methods. Psychol Assess 2016;28:1319–24.
- 8. Grilo CM, Masheb RM, Lozano-Blanco C, Barry DT. Reliability of the Eating Disorder Examination in patients with binge eating disorder. Int J Eat Disord 2004;35:80–5.
- 9. Barnes RD, Masheb RM, White MA, Grilo CM. Comparison of methods for identifying and assessing obese patients with binge eating disorder in primary care settings. Int J Eat Disord 2011;44:157–63.
- 10. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. Fourth, Text Revision ed. Washington, DC: Author; 2004.
- 11. Fairburn CG, Cooper Z. The Eating Disorder Examination. In: Fairburn CG, Wilson GT, editors. Binge Eating: Nature, Assessment, and Treatment. New York: Guilford Press; 1993. p. 317–60.
- 12. Cooper Z, Fairburn C. The Eating Disorder Examination - a Semistructured Interview for the Assessment of the Specific Psychopathology of Eating Disorders. Int J Eat Disord 1987;6(1):1–8.
- 13. Fairburn CG, Cooper Z, O’Connor ME. Eating Disorder Examination (Edition 16.0D). In: Fairburn CG, editor. Cognitive Behavior Therapy and Eating Disorders. New York: Guilford Press; 2008. p. 265–308.
- 14. Grilo CM, Reas DL, Hopwood CJ, Crosby RD. Factor structure and construct validity of the eating disorder examination-questionnaire in college students: Further support for a modified brief version. Int J Eat Disord 2015;48(3):284–9.
- 15. Grilo CM, Crosby RD, Peterson CB, Masheb RM, White MA, Crow SJ, et al. Factor structure of the eating disorder examination interview in patients with binge-eating disorder. Obesity 2010;18:977–81.
- 16. Machado PPP, Grilo CM, Crosby RD. Replication of a modified factor structure for the Eating Disorder Examination-Questionnaire: Extension to clinical eating disorder and non-clinical samples in Portugal. Eur Eat Disord Rev 2018;26:75–80.
- 17. Grilo CM, Henderson KE, Bell RL, Crosby RD. Eating disorder examination-questionnaire factor structure and construct validity in bariatric surgery candidates. Obes Surg 2013;23:657–62.
- 18. Wilfley DE, Welch RR, Stein RI, Spurrell EB, Cohen LR, Saelens BE, et al. A randomized comparison of group cognitive-behavioral therapy and group interpersonal psychotherapy for the treatment of overweight individuals with binge-eating disorder. Arch Gen Psychiatry 2002;59:713–21.
- 19. Grilo CM, Crosby RD, White MA. Spanish-language Eating Disorder Examination interview: Factor structure in Latino/as. Eat Behav 2012;13:410–3.
- 20. Grilo CM, Lozano C, Elder KA. Inter-rater and test-retest reliability of the Spanish language version of the eating disorder examination interview: clinical and research implications. J Psychiatr Pract 2005;11(4):231–40.
- 21. House ET, Lister NB, Seidler AL, Li H, Ong WY, McMaster CM, et al. Identifying eating disorders in adolescents and adults with overweight or obesity: A systematic review of screening questionnaires. Int J Eat Disord 2022;55(9):1171–93.
- 22. McMaster CM, Paxton SJ, Maguire S, Hill AJ, Braet C, Seidler AL, et al. The need for future research into the assessment and monitoring of eating disorder risk in the context of obesity treatment. Int J Eat Disord 2023;56(5):914–24.
- 23. Ivezaj V, Kalarchian MA, King WC, Devlin MJ, Mitchell JE, Crosby RD. Interrater reliability and internal consistency of the eating disorder examination in the longitudinal assessment of bariatric surgery study. Surg Obes Relat Dis 2022;18(8):1015–22.
- 24. Wiedemann AA, Ivezaj V, Lawson JL, Lydecker JA, Cooper Z, Grilo CM. Interrater reliability of the Eating Disorder Examination among postbariatric patients. Surg Obes Relat Dis 2020;16(12):1988–93.
- 25. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med 2016;15(2):155–63.
- 26. Erratum to “A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research” [J Chiropr Med 2016;15(2):155–163]. J Chiropr Med 2017;16(4):346.
- 27. Viera AJ, Garrett JM. Understanding interobserver agreement: The kappa statistic. Fam Med 2005;37:360–3.
- 28. Tavakol M, Dennick R. Making sense of Cronbach’s alpha. Int J Med Educ 2011;2:53–5.
- 29. Grilo CM, White MA, Masheb RM, Ivezaj V, Morgan PT, Gueorguieva R. Randomized Controlled Trial Testing the Effectiveness of Adaptive “SMART” Stepped-Care Treatment for Adults With Binge-Eating Disorder Comorbid With Obesity. Am Psychol 2020;75(2):204–18.
- 30. Grilo CM, White MA, Ivezaj V, Gueorguieva R. Randomized Controlled Trial of Behavioral Weight Loss and Stepped Care for Binge-Eating Disorder: 12-Month Follow-up. Obesity 2020;28:2116–24.
- 31. Blomquist KK, Grilo CM. Predictive significance of changes in dietary restraint in obese patients with binge eating disorder during treatment. Int J Eat Disord 2011;44(6):515–23.
- 32. Mitchell JE, Crosby RD, Wonderlich SA, Hill L, le Grange D, Powers P, et al. Latent profile analysis of a cohort of patients with eating disorders not otherwise specified. Int J Eat Disord 2007;40(S3):S95–S8.
- 33. Wang SB, Jones PJ, Dreier M, Elliott H, Grilo CM. Core psychopathology of treatment-seeking patients with binge-eating disorder: a network analysis investigation. Psychol Med 2019;49(11):1923–8.
- 34. Williamson DA, Martin CK, York-Crowe E, Anton SD, Redman LM, Han H, et al. Measurement of dietary restraint: validity tests of four questionnaires. Appetite 2007;48(2):183–92.
- 35. Grilo CM, Crosby RD, Masheb RM, White MA, Peterson CB, Wonderlich SA, et al. Overvaluation of shape and weight in binge eating disorder, bulimia nervosa, and sub-threshold bulimia nervosa. Behav Res Ther 2009;47:692–6.
